exalead and Siderean guide users down differing paths to data troves
Competing search tools effectively group data and guide users
Search speed was excellent in my test of about 25,000 documents, with results typically displayed in 0.05 second. Just as significant, searches produce easily understandable information. exalead generates thumbnails of documents and Web pages, shows a summary of the information (including where it resides in the source directory structure), and provides a preview window.
exalead’s categorization also truly enhances the whole search process. For instance, I searched for a certain data leak review I wrote last year. Not only did exalead:one find that article on infoworld.com and my local drafts in Microsoft Word format, but it also returned related Web links to security executives and articles about managing e-mail security.
Although exalead:one websearch has indexed more than 4 billion public pages to date, Google and others needn’t fret about exalead encroaching on their leadership in consumer search. In fact, exalead offers a feature to federate Google public searches. But exalead:one represents a key trend of organizing -- and integrating -- public sources for specific research.
exalead:one enterprise indexes up to 200,000 documents. Its straightforward GUI let me set up connectors for crawling SQL and Notes databases, file shares, and intranet Web sites. During crawls, exalead converts data to XML, analyzes it, and indexes it. Additionally, I had great control over the process, such as specifying which categories appeared in search results. Crossing over into the facet realm, admins can import, reuse, and edit classifications from existing taxonomy projects -- or create new cataloging systems by extracting metadata from documents.
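To make the convert, analyze, and index sequence concrete, here is a minimal Python sketch of that kind of pipeline. It is not exalead's implementation: the XML envelope, the tokenizer, and the function names (to_xml, analyze, build_index) are all assumptions chosen for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

def to_xml(doc_id, source, text):
    """Wrap a raw document in a simple XML envelope (hypothetical schema)."""
    doc = ET.Element("document", id=doc_id, source=source)
    ET.SubElement(doc, "body").text = text
    return doc

def analyze(doc):
    """Lowercase the body text and split it into alphanumeric tokens."""
    body = doc.findtext("body", default="")
    return re.findall(r"[a-z0-9]+", body.lower())

def build_index(docs):
    """Build an inverted index: token -> set of document ids containing it."""
    index = defaultdict(set)
    for doc in docs:
        for token in analyze(doc):
            index[token].add(doc.get("id"))
    return index

# Illustrative documents standing in for crawled sources.
raw = [
    ("d1", "fileshare", "Data leak review draft"),
    ("d2", "intranet", "Managing e-mail security"),
]
index = build_index(to_xml(*record) for record in raw)
```

Looking up index["security"] returns the ids of the documents that mention the word. A production crawler would add format converters, linguistic analysis, and category extraction at the analyze step.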
Organizations can install one:enterprise as their sole search solution or couple it with one:desktop. In the latter case, I merely added the enterprise server’s index and thereby federated local and enterprise results.
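Federation of this sort boils down to merging ranked hit lists from more than one index. The Python sketch below shows one simple way to do that. It assumes each index returns (URI, score) pairs with comparable scores, which real engines do not guarantee, and the federate function and sample URIs are hypothetical rather than part of any exalead API.

```python
def federate(*result_lists, limit=10):
    """Merge scored hits from several indexes, keeping the best score per URI."""
    best = {}
    for hits in result_lists:
        for uri, score in hits:
            if uri not in best or score > best[uri]:
                best[uri] = score
    # Sort by descending score, then by URI so ties come out in a stable order.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]

# Hypothetical hits from a desktop index and an enterprise index.
local_hits = [
    ("file:///drafts/leak-review.doc", 0.92),
    ("file:///notes/todo.txt", 0.30),
]
enterprise_hits = [
    ("http://intranet.example/security-policy", 0.75),
    ("file:///drafts/leak-review.doc", 0.88),
]
merged = federate(local_hits, enterprise_hits)
```

The document found in both indexes appears once in the merged list, with the higher of its two scores.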
exalead:one products all provide simple, but not simplistic, search, with quick and easy setup. A unified interface combines results from multiple sources, and the applications were strong performers in deriving structure from documents and automatically generating categories.
Siderean Seamark Navigator 4.0
Siderean’s enterprise search solution includes three main modules. Seamark Navigator 4.0 finds and indexes content in RSS feeds and enterprise databases by recognizing existing metadata -- which is then encoded according to the RDF (Resource Description Framework) open standard. After aggregating the sources you specify, Navigator organizes the information into “facets” presented in a browser interface.
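For readers unfamiliar with RDF, metadata is expressed as subject, predicate, object triples, and facets fall out naturally from grouping on the predicate. The Python sketch below illustrates the idea with plain tuples and Dublin Core predicate URIs. The sample documents, values, and function names are assumptions for illustration only and do not reflect how Seamark stores or queries its data.

```python
from collections import Counter, defaultdict

# Dublin Core element namespace, commonly used for document metadata in RDF.
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical (subject, predicate, object) triples describing three documents.
triples = [
    ("urn:doc:1", DC + "creator", "Alice"),
    ("urn:doc:1", DC + "subject", "security"),
    ("urn:doc:2", DC + "creator", "Bob"),
    ("urn:doc:2", DC + "subject", "security"),
    ("urn:doc:3", DC + "creator", "Alice"),
    ("urn:doc:3", DC + "subject", "search"),
]

def facet_counts(triples):
    """Group triples by predicate and count how often each value occurs."""
    facets = defaultdict(Counter)
    for _subject, predicate, value in triples:
        facets[predicate][value] += 1
    return facets

def filter_by(triples, predicate, value):
    """Return the subjects that carry the given predicate/value pair."""
    return {s for s, p, o in triples if p == predicate and o == value}

facets = facet_counts(triples)
alice_docs = filter_by(triples, DC + "creator", "Alice")
```

Each predicate becomes a facet, each distinct value becomes a clickable choice with a count, and selecting a value narrows the result set to the matching subjects.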
Seamark MAPP (Metadata Assembly Processing Platform) is an entity-extraction system that harvests metadata from unstructured sources, including Microsoft SharePoint and file systems. This application, which uses IBM’s open source UIMA (Unstructured Information Management Architecture) framework, also integrates with commercial products such as Lexalytics and Lockheed Martin’s AeroText.
Compared with the other solutions, Seamark required more of my time to engineer a working search app. However, the Seamark Administration UI clearly maps out the necessary steps, so I didn’t expend much effort learning the system. I started by specifying feeds -- which can be XML documents, database queries, or direct input from supported enterprise search engines.