In 2000, computer researchers at Carnegie Mellon University started a project to fundamentally shift how search results are
organized. The idea behind the approach, called clustering, was to find meaningful connections among Internet search results
to speed and improve research.

Vivísimo Velocity 4.2
Vivísimo, vivisimo.com
Overall: Excellent, 8.7

| Criteria | Score | Weight |
| Ease-of-use | 8 | 20% |
| Integration | 9 | 20% |
| Management | 9 | 20% |
| Performance | 9 | 20% |
| Scalability | 8 | 10% |
| Value | 9 | 10% |

Cost: Starts at $10,000 per year
Platforms: Clustering and metasearch run on Windows, Linux, and Solaris; search engine module is available exclusively for Linux
Bottom Line: Combining dynamic clustering, search, and metasearch, Velocity can be deployed faster and less expensively than other search
solutions. Velocity simultaneously searches multiple information sources and presents the results in organized folders from
a single consistent interface. Additionally, the software will combine several URLs into a single result. Custom parsing makes
results more meaningful.
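For readers who want to check the arithmetic, the overall score is consistent with a weighted average of the six criteria. The short sketch below assumes that is how the total is derived; the function name is illustrative, and the scores and weights are copied from the scorecard.

```python
# Scores and percentage weights copied from the scorecard.
# Assumption: the overall score is a plain weighted average of these rows.
SCORECARD = {
    "Ease-of-use": (8, 20),
    "Integration": (9, 20),
    "Management": (9, 20),
    "Performance": (9, 20),
    "Scalability": (8, 10),
    "Value": (9, 10),
}


def weighted_score(scorecard):
    """Return the weight-averaged score, rounded to one decimal place."""
    total_weight = sum(weight for _, weight in scorecard.values())
    weighted_sum = sum(score * weight for score, weight in scorecard.values())
    return round(weighted_sum / total_weight, 1)


print(weighted_score(SCORECARD))  # 8.7
```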
During the next five years the resulting company, Vivísimo, commercialized the original clustering engine and extended its
offerings to include meta (federated) search and its own search engine. Called Velocity, this technology threesome should
have wide appeal. Academics, scientists, government analysts, market researchers, online publishers, and product managers
in any industry will benefit from Velocity 4.2, as they all must search through and make sense of large, diverse data sources.
Velocity specifically includes three components: Vivísimo Clustering Engine, which automatically categorizes search results
on the fly into meaningful hierarchical folders (it overlays any search or database query engine); Vivísimo Content Integrator,
for simultaneously querying multiple content sources -- such as search engines and databases -- in one step and combining
the retrieved information; and the Vivísimo Search Engine.
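To illustrate what combining results from several sources involves, here is a minimal sketch of a metasearch merge that interleaves ranked lists and collapses duplicate URLs into one result. It is not Vivísimo's implementation: the record fields (`source`, `title`, `url`), the normalization rules, and the example URLs are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Build a comparison key: lowercase host without 'www.', path without
    a trailing slash. Scheme and query string are ignored (a simplification)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def merge_results(*result_lists):
    """Interleave ranked lists from several sources, merging duplicate URLs."""
    merged = {}  # normalized URL -> combined record, kept in first-seen order
    longest = max((len(results) for results in result_lists), default=0)
    for rank in range(longest):
        for results in result_lists:
            if rank >= len(results):
                continue
            hit = results[rank]
            key = normalize(hit["url"])
            if key in merged:
                # Same page reported by another source: record it, keep one entry.
                merged[key]["sources"].append(hit["source"])
            else:
                merged[key] = {
                    "title": hit["title"],
                    "url": hit["url"],
                    "sources": [hit["source"]],
                }
    return list(merged.values())


# Hypothetical results from two content sources.
intranet = [
    {"source": "intranet", "title": "Travel policy", "url": "http://www.example.com/policy/"},
    {"source": "intranet", "title": "Expense form", "url": "http://example.com/forms/expense"},
]
database = [
    {"source": "database", "title": "Travel Policy (PDF)", "url": "http://example.com/policy"},
]

merged = merge_results(intranet, database)
for result in merged:
    print(result["title"], result["sources"])
```

Round-robin interleaving is only one way to order merged results; a production system would more likely weigh each source's own relevance scores.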
Enterprises will typically start with clustering and metasearch because most already have some type of search engine in place.
I nonetheless tested all three components on an Intel-based server running Red Hat Enterprise Linux 3.0.
Velocity is an especially deep product, as reflected in the number of options available from the UI. That depth may confuse
first-time users, however: some menus are several layers deep and not always labeled intuitively. Vivísimo developers
are working with usability experts to address this shortcoming.
Still, in the more important performance areas, Velocity delivers. To evaluate clustering I connected Velocity to an existing
Verity Ultraseek search engine. The process involves completing two Web forms: one describes the XML output from the search engine, and
the second indicates how to parse the results. Although this does require knowledge of your original search implementation,
I had clustering running in approximately 30 minutes. Vivísimo has done an excellent job organizing results into clusters
by intelligently using words and phrases contained in the original searches.
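The two-step setup described above, telling the software what the engine's XML looks like and then how to parse it, can be sketched in a few lines. Everything here is an assumption for illustration: the XML layout and tag names are invented rather than Ultraseek's actual output, and the grouping step is a deliberately naive shared-word heuristic, not Vivísimo's clustering algorithm.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical XML output from an existing search engine.
SAMPLE_XML = """<results>
  <hit><title>Linux server setup guide</title><link>http://example.com/a</link></hit>
  <hit><title>Linux kernel tuning</title><link>http://example.com/b</link></hit>
  <hit><title>Windows server licensing</title><link>http://example.com/c</link></hit>
</results>"""

STOPWORDS = {"a", "an", "and", "for", "of", "the", "to"}


def parse_hits(xml_text, hit_tag="hit", title_tag="title", url_tag="link"):
    """Turn engine XML into result records; the tag names play the role of
    the 'describe the output' and 'how to parse it' settings."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": hit.findtext(title_tag, default=""),
            "url": hit.findtext(url_tag, default=""),
        }
        for hit in root.iter(hit_tag)
    ]


def cluster_by_term(hits, min_size=2):
    """Group result URLs into folders keyed by words their titles share."""
    folders = defaultdict(list)
    for hit in hits:
        words = set(re.findall(r"[a-z]+", hit["title"].lower())) - STOPWORDS
        for word in words:
            folders[word].append(hit["url"])
    # Keep only folders that actually group several results.
    return {term: urls for term, urls in folders.items() if len(urls) >= min_size}


hits = parse_hits(SAMPLE_XML)
clusters = cluster_by_term(hits)
for term in sorted(clusters):
    print(term, clusters[term])
```

With the sample data, the results fall into a "linux" folder and a "server" folder, with one page appearing in both.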
Although Velocity didn’t have a specific setup for Ultraseek, there are clustering templates for other common enterprise search
engines, including the Google Search Appliance; these prepopulated forms should save administrators time and reduce setup errors.
Configuring Vivísimo’s search engine required little effort. I easily created a source by defining the starting URL of an
intranet Web site. Then I selected a few other options, such as the maximum link depth. Again, clustered results were very
precise; no fine-tuning was required.
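The maximum link depth setting is easiest to picture as a bounded breadth-first crawl from the starting URL. The sketch below is a generic illustration, not Velocity's crawler: the `crawl` function, the `get_links` callback, and the in-memory `SITE` map standing in for an intranet are all hypothetical.

```python
from collections import deque


def crawl(start_url, get_links, max_depth):
    """Breadth-first crawl that does not follow links past max_depth hops."""
    seen = {start_url}
    order = []
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # index this page, but do not follow its links
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order


# Hypothetical intranet link structure, used instead of real HTTP requests.
SITE = {
    "http://intranet.example/": ["http://intranet.example/hr", "http://intranet.example/it"],
    "http://intranet.example/hr": ["http://intranet.example/hr/benefits"],
    "http://intranet.example/it": ["http://intranet.example/"],
    "http://intranet.example/hr/benefits": [],
}

pages = crawl("http://intranet.example/", lambda url: SITE.get(url, []), max_depth=1)
print(pages)
```

With a depth of 1, the benefits page (two hops from the start) is left out; raising the limit to 2 brings it in.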
Next, I used the built-in search to index documents on a file server and a Microsoft SQL Server database. Besides handling typical
file formats (Microsoft Office, PDF, e-mail archives, and Zip archives), the search engine crawls sources that require authentication,
such as a content management system. In the latter case the software correctly hid results from users not authorized to view
them.
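Hiding results from unauthorized users is commonly done by filtering each result list against the access rights recorded at crawl time. The following is a minimal sketch of that idea under assumed data: the `allowed_groups` field, the group names, and the URLs are hypothetical, and the source does not describe how Velocity enforces this internally.

```python
def visible_results(results, user_groups):
    """Keep only results whose allowed groups overlap the user's groups."""
    groups = set(user_groups)
    return [result for result in results if groups & set(result["allowed_groups"])]


# Hypothetical indexed documents with the groups permitted to read them.
RESULTS = [
    {"title": "Cafeteria menu", "url": "http://cms.example/menu", "allowed_groups": ["staff"]},
    {"title": "Salary bands", "url": "http://cms.example/salary", "allowed_groups": ["hr"]},
    {"title": "Hiring plan", "url": "http://cms.example/hiring", "allowed_groups": ["hr", "managers"]},
]

for result in visible_results(RESULTS, ["staff", "managers"]):
    print(result["title"])
```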
With the basics done, I examined customization. Velocity offers just about all the control an enterprise requires. For example,
customized HTML parsing let me strip unnecessary markup from pages, including navigation and sidebar links. As a consequence,
result summaries were even cleaner and more relevant.
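As a rough picture of what this kind of parsing does, the sketch below uses Python's standard `html.parser` module to drop text inside navigation and sidebar elements before a summary is built. It is an illustration only: Velocity's parsing is configured through its own interface, and the choice of tags to skip and the sample page are assumptions.

```python
from html.parser import HTMLParser

# Assumption: content inside these elements is page furniture, not body text.
SKIP_TAGS = {"nav", "aside", "header", "footer", "script", "style"}


class ContentExtractor(HTMLParser):
    """Collect text that is not nested inside any of the SKIP_TAGS elements."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # how many skipped elements are currently open
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.skip_depth == 0 and text:
            self.chunks.append(text)


def extract_text(html):
    """Return the page's main text with navigation and sidebars removed."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


PAGE = """<html><body>
<nav><a href="/">Home</a> <a href="/hr">HR</a></nav>
<h1>Travel policy</h1>
<p>Book flights through the approved agency.</p>
<aside><a href="/news">Company news</a></aside>
</body></html>"""

print(extract_text(PAGE))  # Travel policy Book flights through the approved agency.
```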