Even the best Web search engines deliver so many hits that users overlook relevant documents or find it too bothersome to explore a subject further. Alas, many attempts to hone search results fail because they rely on document preprocessing or manual classification that introduces inaccuracies and delays. Vivísimo’s Clustering Engine sidesteps these costly taxonomy projects by organizing search and database queries into meaningful, hierarchical folders on the fly.
This hot technology, which powers the company’s public Clusty.com search site, now has a corporate sibling. Vivísimo Velocity is a Linux server application, front-ended by a Web administration tool, that combines dynamic document clustering, crawls of as many as one million enterprise documents, and metasearches of an unlimited number of other search engines or documents.
I was impressed with how well Velocity federated search results from internal search projects and external sources. In a few hours, I’d combined intranet searches from Convera RetrievalWare and a Google Search Appliance, several external Web sites, and documents on a local file server. Significantly, I did this without fiddling with settings, so I believe a full implementation shouldn’t require special IT expertise.
At the next level, you can customize crawls so there’s no need to touch the original content. For example, I built an XSL style sheet that recognizes the existing structure of a subscription news site, so multiple articles listed on an index page were treated as separate documents and placed in the correct clusters.
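To make the idea of splitting one index page into separate documents concrete, here is a minimal sketch in Python rather than XSL. It does not use Velocity's configuration format or API. The page markup, the `article` class name, and the `split_articles` helper are all hypothetical, chosen only to illustrate the technique.

```python
import xml.etree.ElementTree as ET

# Hypothetical index page (an assumption, not the real news site): each
# article sits in its own <div class="article"> with a headline and summary.
INDEX_PAGE = """
<html><body>
  <div class="article"><h2>Search vendors consolidate</h2><p>Two enterprise search firms merge.</p></div>
  <div class="article"><h2>Clustering goes mainstream</h2><p>Folders replace flat hit lists.</p></div>
</body></html>
"""


def split_articles(page_xml):
    """Return one {'title', 'body'} record per article block on the page."""
    root = ET.fromstring(page_xml.strip())
    records = []
    for block in root.iter("div"):
        # Skip any <div> that is not marked as an article.
        if block.get("class") != "article":
            continue
        title = block.findtext("h2", default="").strip()
        body = " ".join(p.text.strip() for p in block.findall("p") if p.text)
        records.append({"title": title, "body": body})
    return records


# Each record can now be indexed and clustered as its own document.
documents = split_articles(INDEX_PAGE)
for doc in documents:
    print(doc["title"])
```

A real style sheet would express the same mapping declaratively, and a live site would probably need a more forgiving HTML parser than the strict XML parser assumed here.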
Velocity appears to live up to its name with speedy implementation while precisely integrating results in easy-to-navigate clusters. Its arguably high price is far lower than the cost of multiyear search projects that never return their investment.
Vivísimo Velocity
Vivísimo
Cost: Starts at $10,000 per year for 50,000 documents
Available: End of November
