February 28, 2003

Search tools look for context

Information retrieval gains improved relevance

To improve the accuracy of corporate information retrieval technology, several vendors are incorporating techniques designed to identify people, proper names, and places buried within text and to analyze the relationships among them.

To that end, unstructured data management vendor Recommind on Friday rolled out MindServer 2.1, which adds entity extraction capabilities to its concept-based search and classification system.

Referred to as both entity- and fact-based extraction, the technology complements search engines, content management systems, and portals by helping improve relevancy of results, according to analyst Laura Ramos, director of research at Giga Information Group, in Cambridge, Mass.

"The key thing here is this ability to pull specific words and phrases out of documents and automate figuring out what they mean," Ramos said.

"With keyword search you are looking for individual words, but individual words can mean different things in a context. For example, [a search for] 'President Bush lives at the White House' means something different than 'white house paint.'"

The ability to pull out names, places, and dates via entity and fact extraction is becoming critical because it helps disambiguate information retrieval and make it more relevant, Ramos added.
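Ramos's example can be made concrete with a small sketch. The Python below is illustrative only: it assumes a hand-made lookup table (a hypothetical `GAZETTEER`) and exact, case-sensitive matching, and it does not represent how MindServer, SmartDiscovery, or any other product named here works. It shows that a plain keyword match treats both phrases the same, while even a crude entity lookup tells them apart.

```python
import re

# Hypothetical gazetteer: a hand-made table of known entities and their types.
# Real extraction systems rely on far richer linguistic analysis.
GAZETTEER = {
    "President Bush": "PERSON",
    "White House": "PLACE",
}


def keyword_match(query_terms, text):
    """Return True if every query term appears as a word in text, ignoring case."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return all(term.lower() in words for term in query_terms)


def extract_entities(text):
    """Return (entity, type) pairs found by exact, case-sensitive lookup."""
    found = []
    for name, kind in GAZETTEER.items():
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            found.append((name, kind))
    return found


docs = [
    "President Bush lives at the White House",
    "white house paint",
]

for doc in docs:
    # Both documents match the keywords; only the first yields entities.
    print(doc, keyword_match(["white", "house"], doc), extract_entities(doc))
```

A search for "white house" matches both documents on keywords alone. Only the first document yields a PLACE entity, which a retrieval system could use to rank or filter results.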

MindServer 2.1 delivers accuracy and analysis capabilities by combining the ability to identify people, product names, and places with retrieval and categorization, according to Bob Tennant, CEO of Recommind, in Berkeley, Calif.

"What it allows is analysis of the textual data; cutting [the data] not just by subject matter but by other elements that can be identified, [such as] who individuals are, what the products are," Tennant said. "This lets organizations take cuts of this data and make it more usable."

Another unstructured data management player, Inxight Software, is ramping up fact extraction capabilities in its SmartDiscovery information retrieval product, which combines natural-language processing, linguistics, and classification technologies.

Inxight's technology has gained strong traction within government agencies, where it is often used to further counter-terrorism and intelligence efforts. The Sunnyvale, Calif.-based company this month signed 10 contracts worth $3 million with the U.S. Department of Defense.

Later this year, Inxight plans to roll out an updated offering that builds technology from WhizBang Labs into its natural-language processing platform. Inxight acquired WhizBang's technical assets last August; WhizBang had developed technology designed to extract facts and relationships from unstructured data.

The new technology "goes beyond identifying things to establish the relationships between things," according to David Spenhoff, vice president of marketing at Inxight. "Extracting facts and relationships provides a higher level of understanding beyond what search engines can provide."


©1994-2009 Infoworld, Inc.