InfoWorld
Vivísimo Velocity brings structure to enterprise search

Search platform clusters data into useful, relevant categories

By Mike Heck
May 23, 2005
 

In 2000, computer researchers at Carnegie Mellon University started a project to fundamentally shift how search results are organized. The idea behind the approach, called clustering, was to find meaningful connections among Internet search results to speed and improve research.


Vivísimo Velocity 4.2

Vivísimo, vivisimo.com

Rating: Excellent (8.7)

Criteria       Score   Weight
Ease-of-use    8       20%
Integration    9       20%
Management     9       20%
Performance    9       20%
Scalability    8       10%
Value          9       10%
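
As a quick check on the scorecard arithmetic, the sketch below recomputes the overall rating. It assumes the overall number is a plain weighted average of the six criteria, which the listed figures are consistent with; the names `SCORECARD` and `weighted_score` are illustrative only.

```python
# Scores and percentage weights copied from the scorecard above.
# Assumption: the overall rating is a simple weighted average.
SCORECARD = {
    "Ease-of-use": (8, 20),
    "Integration": (9, 20),
    "Management": (9, 20),
    "Performance": (9, 20),
    "Scalability": (8, 10),
    "Value": (9, 10),
}


def weighted_score(scorecard):
    """Return the weighted average score, rounded to one decimal place."""
    total_weight = sum(weight for _, weight in scorecard.values())
    weighted_sum = sum(score * weight for score, weight in scorecard.values())
    return round(weighted_sum / total_weight, 1)


print(weighted_score(SCORECARD))
```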

Cost:
Starts at $10,000 per year

Platforms:
Clustering and metasearch run on Windows, Linux, and Solaris; search engine module is available exclusively for Linux

Bottom Line:
Combining dynamic clustering, search, and metasearch, Velocity can be deployed faster and less expensively than other search solutions. Velocity simultaneously searches multiple information sources and presents the results in organized folders from a single consistent interface. Additionally, the software will combine several URLs into a single result. Custom parsing makes results more meaningful.


During the next five years the resulting company, Vivísimo, commercialized the original clustering engine and extended its offerings to include meta (federated) search and its own search engine. Called Velocity, this technology threesome should have wide appeal. Academics, scientists, government analysts, market researchers, online publishers, and product managers in any industry will benefit from Velocity 4.2, as they all must search through and make sense of large, diverse data sources.

Velocity specifically includes three components: Vivísimo Clustering Engine, which automatically categorizes search results on the fly into meaningful hierarchical folders (it overlays any search or database query engine); Vivísimo Content Integrator, for simultaneously querying multiple content sources -- such as search engines and databases -- in one step and combining the retrieved information; and the Vivísimo Search Engine.
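
To make the metasearch idea concrete, here is a minimal sketch of merging ranked results from several sources and folding duplicate URLs into a single entry. It is not Vivísimo's implementation; the source names, the sample data, and the functions `normalize_url` and `merge_results` are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def normalize_url(url):
    """Build a comparison key: lowercase host without 'www.', path without trailing slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def merge_results(results_by_source):
    """Interleave ranked result lists round-robin, keeping one entry per normalized URL."""
    merged = {}  # dicts preserve insertion order, so this also records the merged ranking
    longest = max((len(hits) for hits in results_by_source.values()), default=0)
    for rank in range(longest):
        for source, hits in results_by_source.items():
            if rank >= len(hits):
                continue
            title, url = hits[rank]
            key = normalize_url(url)
            if key in merged:
                # Same page seen from another source: record the source, keep one result.
                merged[key]["sources"].append(source)
            else:
                merged[key] = {"title": title, "url": url, "sources": [source]}
    return list(merged.values())


# Hypothetical results from two content sources.
example = {
    "intranet": [
        ("Travel policy", "http://www.example.com/policy/"),
        ("Expense form", "http://example.com/forms/expense"),
    ],
    "wiki": [
        ("Policy: travel", "https://example.com/policy"),
        ("Onboarding", "https://wiki.example.com/onboarding"),
    ],
}
merged = merge_results(example)
for entry in merged:
    print(entry["title"], entry["sources"])
```

Round-robin interleaving is only one possible ranking policy; a production metasearch layer would more likely weight sources by relevance scores.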

Enterprises will typically start with clustering and metasearch because most already have some type of search engine in place. I tested all three components, however, on an Intel-based server running Red Hat Enterprise Linux 3.0.

Velocity is an especially deep product, as is reflected in the number of options available from the UI. But the UI may confuse first-time users. For example, some menus are several layers deep and not always labeled intuitively. Vivísimo developers are working with usability experts to address this shortcoming.

Still, in the more important performance areas, Velocity delivers. To evaluate clustering I connected Velocity to an existing Verity Ultraseek search engine. The process involves completing two Web forms, one that describes the XML output from the search engine and the second that indicates how to parse the results. Although this does require knowledge of your original search implementation, I had clustering running in approximately 30 minutes. Vivísimo has done an excellent job organizing results into clusters by intelligently using words and phrases contained in the original searches.

Although Velocity didn’t have a specific setup for Ultraseek, there are clustering templates for other common enterprise search engines, including the Google Search Appliance; these prepopulated forms should save administrators time and reduce setup errors.
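
The two-step setup described above (describe the engine's XML output, then say how to parse it) can be sketched roughly as follows. The XML layout, the tag names, and the naive shared-word grouping are all assumptions: the review does not show Ultraseek's actual output schema, and Vivísimo's clustering algorithm is certainly more sophisticated than this toy.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical search-engine output; the real schema is not given in the review.
SAMPLE_XML = """
<results>
  <hit><title>Laptop battery recall notice</title><url>http://intranet.example/a</url></hit>
  <hit><title>Battery disposal guidelines</title><url>http://intranet.example/b</url></hit>
  <hit><title>Laptop purchasing guidelines</title><url>http://intranet.example/c</url></hit>
  <hit><title>Cafeteria menu</title><url>http://intranet.example/d</url></hit>
</results>
"""

STOPWORDS = {"the", "a", "an", "of", "and", "for"}


def parse_hits(xml_text, hit_tag="hit", title_tag="title", url_tag="url"):
    """Step 1 and 2: name the elements that hold each result, then extract them."""
    root = ET.fromstring(xml_text.strip())
    return [(hit.findtext(title_tag, ""), hit.findtext(url_tag, ""))
            for hit in root.iter(hit_tag)]


def cluster_by_term(hits, min_size=2):
    """Group titles under every word shared by at least min_size results."""
    term_to_titles = defaultdict(list)
    for title, _url in hits:
        for term in sorted(set(title.lower().split()) - STOPWORDS):
            term_to_titles[term].append(title)
    clusters = {term: titles for term, titles in term_to_titles.items()
                if len(titles) >= min_size}
    clustered = {title for titles in clusters.values() for title in titles}
    leftovers = [title for title, _url in hits if title not in clustered]
    if leftovers:
        clusters["other"] = leftovers  # catch-all folder for unclustered results
    return clusters


hits = parse_hits(SAMPLE_XML)
clusters = cluster_by_term(hits)
for label, titles in clusters.items():
    print(label, titles)
```

Note that a result may land in more than one folder here, since each title is filed under every qualifying word it contains.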

Configuring Vivísimo’s search engine required little effort. I easily created a source by defining the starting URL of an intranet Web site. Then I selected a few other options, such as the maximum link depth. Again, clustered results were very precise; no fine-tuning was required.
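
For readers unfamiliar with the maximum link depth option, the sketch below shows what such a limit typically means in a crawler: pages more than a set number of links away from the starting URL are never visited. The in-memory `LINKS` graph stands in for real HTTP fetches, and every URL and name here is hypothetical rather than taken from Velocity.

```python
from collections import deque

# Hypothetical intranet link graph used in place of real page fetches.
LINKS = {
    "http://intranet.example/": ["http://intranet.example/hr",
                                 "http://intranet.example/it"],
    "http://intranet.example/hr": ["http://intranet.example/hr/benefits"],
    "http://intranet.example/it": ["http://intranet.example/"],
    "http://intranet.example/hr/benefits": ["http://intranet.example/hr/benefits/dental"],
}


def crawl(start_url, max_depth, get_links=lambda url: LINKS.get(url, [])):
    """Breadth-first crawl; returns {url: depth} for pages within max_depth links."""
    seen = {start_url: 0}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        depth = seen[url]
        if depth == max_depth:
            continue  # page is indexed, but its outgoing links are not followed
        for link in get_links(url):
            if link not in seen:  # also prevents loops such as /it -> /
                seen[link] = depth + 1
                queue.append(link)
    return seen


pages = crawl("http://intranet.example/", max_depth=2)
for url, depth in pages.items():
    print(depth, url)
```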

Next, I used the built-in search engine to index documents on a file server and in a Microsoft SQL Server database. Besides handling typical file formats (Microsoft Office, PDF, e-mail archives, and Zip archives), the search engine crawls sources that require authentication, such as a content management system. In the latter case, the software correctly hid results from users not authorized to view them.
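
Hiding results from unauthorized users is often called security trimming. A minimal sketch of the idea follows, assuming each indexed document carries a list of groups allowed to read it; the field name `allowed_groups`, the group names, and the sample documents are invented for illustration and say nothing about how Velocity stores permissions.

```python
def visible_results(results, user_groups):
    """Return only the results that at least one of the user's groups may read."""
    groups = set(user_groups)
    return [r for r in results if groups & set(r["allowed_groups"])]


# Hypothetical indexed documents with access-control metadata.
results = [
    {"title": "Company holiday calendar", "allowed_groups": ["staff"]},
    {"title": "Salary bands 2005", "allowed_groups": ["hr"]},
    {"title": "Server room access log", "allowed_groups": ["it", "security"]},
]

staff_view = visible_results(results, ["staff"])
hr_view = visible_results(results, ["staff", "hr"])
print([r["title"] for r in staff_view])
print([r["title"] for r in hr_view])
```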

With the basics done, I examined customization. Velocity offers just about all the control an enterprise requires. For example, customized HTML parsing allowed me to strip unnecessary markup from pages, including navigation and sidebar links. As a consequence, result summaries were even cleaner and more relevant.
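
The effect of stripping navigation and sidebar markup before indexing can be illustrated with Python's standard `html.parser` module. This is a sketch under assumptions, not Velocity's parser: it assumes the unwanted regions are wrapped in `nav` or `aside` elements, whereas real pages may mark them with `div` ids or classes, and the names `SKIP_TAGS`, `ContentExtractor`, and `extract_text` are illustrative.

```python
from html.parser import HTMLParser

# Assumption: boilerplate regions are wrapped in these elements.
SKIP_TAGS = {"nav", "aside", "script", "style"}


class ContentExtractor(HTMLParser):
    """Collect page text while ignoring anything inside SKIP_TAGS elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the visible text of an HTML page minus navigation and sidebars."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


page = ('<html><body><nav><a href="/">Home</a> <a href="/news">News</a></nav>'
        '<p>Quarterly results are up.</p>'
        '<aside>Related links</aside></body></html>')
summary = extract_text(page)
print(summary)
```

Feeding only the extracted text to the indexer keeps menu labels and link lists out of result summaries.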





 


 
Mike Heck is a contributing editor for the InfoWorld Test Center.
 
