August 06, 2004

Edmunds.com deploys text mining tool for user forums

Online vehicle Web site to use Attensity PowerDrill technology

Edmunds.com, an online service for vehicle information, unveiled its latest tool to mine the potentially invaluable data stored as unstructured content in its user forums, consumer ratings, and review archives.

Edmunds.com currently hosts more than 2.5 million messages and 100,000 car reviews, in which consumers provide personal reviews, lists of favorite features, and suggestions for improvement.

Edmunds.com will be deploying PowerDrill, a technology from Attensity that converts written language into relational data. The tool is in beta trials now.

PowerDrill takes the unstructured data, namely sentences, and diagrams each one, identifying parts of speech such as noun phrases, verb phrases, and prepositional phrases. It places the results into separate fields (actor, action, and object), which a standard database can then use to discover relationships and trends.
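To make the idea of sentences becoming relational data concrete, here is a minimal Python sketch. It does not use or imitate Attensity's software: the rows are hand-written stand-ins for what a diagramming step might produce, and the table name `statements`, the column name `obj`, and the helper functions are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical actor/action/object rows, written by hand for illustration.
# They are NOT PowerDrill output and do not come from Edmunds.com data.
ROWS = [
    ("bolt", "cracked", "heat"),
    ("transmission", "slips", "second gear"),
    ("transmission", "hesitates", "cold start"),
    ("cabin", "lets in", "road noise"),
]


def load(rows):
    """Store the rows in an in-memory SQLite table, one field per column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE statements (actor TEXT, action TEXT, obj TEXT)")
    conn.executemany("INSERT INTO statements VALUES (?, ?, ?)", rows)
    return conn


def mentions_by_actor(conn):
    """Count statements per actor, most frequently mentioned first."""
    cur = conn.execute(
        "SELECT actor, COUNT(*) AS n FROM statements "
        "GROUP BY actor ORDER BY n DESC, actor ASC"
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = load(ROWS)
    for actor, n in mentions_by_actor(conn):
        print(actor, n)
```

Once text is in this shape, an ordinary SQL query such as the `GROUP BY` above is enough to tabulate which components are mentioned most often.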

Although text mining for content intelligence is offered in a number of products from other companies, such as ClearForest Tags, Inxight SmartDiscovery, and IBM WebFountain, Laura Ramos, vice president at Forrester Research, called Attensity's diagram capability unique.

"The use of diagramming rather than rules or examples is more accurate and specific," said Ramos.

In a sentence such as, "the bolt on the under-carriage of the car is cracked due to heat," other products that use linguistic or grammatical rules would assume the under-carriage was cracked because that word was closest to the verb, Ramos said. However, because Attensity diagrams the sentence, it understands that the bolt, not the under-carriage, is cracked, said Ramos.

Attensity integrates the relational data it creates from text with other pre-existing structured content and outputs the result in any format, including XML, said Craig Norris, Attensity CEO. "PowerDrill can diagram Moby Dick in five seconds," he added.

Using PowerDrill, Edmunds.com plans on tabulating suggestions for improvement and the ranking of favorite features.

In a test with Honda Odyssey, a highly anticipated 2005 car model, information from drivers of previous model years was tabulated, allowing Edmunds.com to show that the most needed improvements were in road noise, transmission issues, and styling.

Edmunds.com was able to analyze trend information from conversations on the forums, including shopping and dealer behavior and recurring issues and concerns, which can also be used to predict future behavior.

Companies such as Edmunds.com as well as government agencies are suddenly waking up to the richness of unstructured data, according to one industry analyst.

"They may have been aware of its existence but didn’t think they could extract any value out of it," said Nick Patience, a senior analyst at The 451 Group.

Ephraim Schwartz is an editor at large at InfoWorld. He also writes the Reality Check blog.

©1994-2009 Infoworld, Inc.