Thunderstone Search Appliance finds the needle
Enterprise Web searcher combines ease of use, powerful searching, and blazing speed
The Thunderstone Search Appliance is designed to be a nearly plug-and-play solution to enterprise Web searching. You’re supposed to be able to install the Linux-based 1U appliance in a rack, plug in a network cable and a power cable, and begin indexing. While the reality is a little more involved, Thunderstone is clearly onto something. I found that the process of getting the Search Appliance up and running was quick and easy, and that once running, it was a very powerful search appliance indeed.
Getting the Thunderstone appliance up and running is not complicated. You install it in the rack, connect the power and network cables, and attach a keyboard and monitor to ports on the back to make the basic settings, such as the IP address and host name. After that, you do the rest using the Web-based management console. The primary settings are the URL of the Web site you want the appliance to index and the file types you want it to retrieve. You can also choose to exclude material such as CGI scripts and inline frames.
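To make those settings concrete, here is a minimal Python sketch of the kind of decision a site walk has to make for each link it finds. It is not Thunderstone's code or configuration format; the function name `should_fetch`, its parameters, and the rule that a path under `/cgi-bin/` or ending in `.cgi` counts as a CGI script are all assumptions made for illustration.

```python
import posixpath
from urllib.parse import urlparse


def should_fetch(url, base_url, allowed_extensions, exclude_cgi=True):
    """Return True if a hypothetical walk would retrieve this URL."""
    base = urlparse(base_url)
    target = urlparse(url)

    # Stay on the site that is being indexed.
    if target.netloc.lower() != base.netloc.lower():
        return False

    path = target.path or "/"

    # Assumption: CGI scripts live under /cgi-bin/ or end in .cgi.
    if exclude_cgi and ("/cgi-bin/" in path or path.lower().endswith(".cgi")):
        return False

    # Assumption: a path with no extension (a directory or clean URL)
    # is an ordinary page and is always retrieved.
    extension = posixpath.splitext(path)[1].lower()
    return extension == "" or extension in allowed_extensions
```

With `{".html", ".pdf"}` as the allowed file types, this sketch would retrieve `http://www.example.com/docs/report.pdf` but skip `http://www.example.com/cgi-bin/search.cgi` and anything on another host.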
Once you’ve accomplished the basic setup tasks, you tell the appliance to “walk” through the Web site, and off it goes, periodically reporting progress as it proceeds. And that’s all there is to it. You can tweak the process using the management interface if you desire, and fire off new walks when you feel like it. You can also set the appliance to perform walks at set times (daily at 1:00 a.m., for example).
There’s little to say about the administrative interface. It’s plain, consisting of boxes you can fill in, boxes you can check, menus you can pull down, and buttons you can push. Functions are separated by tabs and links. If you’re even minimally experienced using Web software (a near certainty) you’ll prefer this interface over the busy wizards and user-friendly excesses of the competition.
Once the setup is complete, your users will find Thunderstone easy and flexible. Basic searches work just as they do in any other engine. Enter what you want, and Thunderstone goes and looks for it. The advanced searches, however, are where Thunderstone really shines.
As is the case with most search engines, you can perform Boolean searches and you can match phrases, but Thunderstone goes beyond that. You can also specify the proximity of the entries, how close the match has to be, and what word forms are acceptable. You can rank results by word order, proximity, frequency, and where the search subject appears in the text. Once you’ve performed a search, you can search within those results for additional terms.
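For readers unfamiliar with proximity searching, the idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not Thunderstone's implementation or query syntax; the function name `within_proximity` and the choice to measure distance in whole words are assumptions.

```python
import re


def within_proximity(text, term_a, term_b, max_distance):
    """Return True if term_a and term_b occur within max_distance words of each other."""
    # Split the text into lowercase words so matching ignores case and punctuation.
    words = re.findall(r"\w+", text.lower())

    positions_a = [i for i, word in enumerate(words) if word == term_a.lower()]
    positions_b = [i for i, word in enumerate(words) if word == term_b.lower()]

    # A match needs at least one pair of occurrences that is close enough.
    return any(
        abs(a - b) <= max_distance
        for a in positions_a
        for b in positions_b
    )
```

In the sentence "The search appliance indexes the enterprise Web site quickly", the words "search" and "site" are six words apart, so a proximity limit of three rejects the match while a limit of six accepts it.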
After you submit a query, Thunderstone searches its database and provides the responses almost instantly. The company claims the appliance can handle 1,000 queries a minute, and I found no reason to dispute this. Normally, the responses from Thunderstone are sorted by relevance and ranked according to your choices of ranking factors. You may also sort by date.
Unfortunately, sorting by date may not deliver the results you expect. Thunderstone sorts on the last modified date of the HTML file (usually the same as the file date). This means that if you post an article on one day and revise it later, Thunderstone sorts it by the date of the revision, not the date of publication. The InfoWorld Web site includes date meta tags for the date an article was published. While Thunderstone can search for those meta tags, it can’t use them as the source for the date sort. A Thunderstone spokesperson said that date sorting will be improved in a future release.
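The effect is easy to see with two made-up articles. The Python sketch below is purely illustrative: the titles, dates, and field names (`published`, `last_modified`) are invented, and the code only mimics the two sort orders rather than anything inside the appliance.

```python
from datetime import date

# Invented sample data: one article was edited well after it was published.
articles = [
    {
        "title": "Older article, edited recently",
        "published": date(2004, 1, 5),
        "last_modified": date(2004, 3, 20),
    },
    {
        "title": "Newer article, never edited",
        "published": date(2004, 2, 10),
        "last_modified": date(2004, 2, 10),
    },
]


def sort_newest_first(items, key):
    """Return the items ordered from newest to oldest by the given date field."""
    return sorted(items, key=lambda item: item[key], reverse=True)


# Sorting on the file's last modified date, as the review describes.
by_modified = [a["title"] for a in sort_newest_first(articles, "last_modified")]

# Sorting on the publication date, which is what a reader would likely expect.
by_published = [a["title"] for a in sort_newest_first(articles, "published")]
```

Sorted by last modified date, the older but recently edited article comes first; sorted by publication date, the order is reversed.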
While the price for the Thunderstone Search Appliance seems a little steep, the product performs exactly as intended. It is easy to administer, easy to use, and very fast. More importantly, it excels at searching, especially when you use its advanced search capabilities. While it won’t figure out badly spelled words as Google does, and while the date sorting works differently from what you might expect, these are hardly major indictments for what I found to be an excellent product. This one is worth looking at.
InfoWorld CTO Chad Dickerson contributed to this review.