May 24, 2006

Researchers look to semantic Web to drive Internet

Computer scientists discuss ideas for organizing the Internet's growing mass of data

Hundreds of researchers and computer scientists are plotting the Internet's next course in Edinburgh, Scotland, this week, poring over research papers and discussing ideas that include how to organize the Internet's growing mass of data.

Much of the discussion is centered on the "semantic Web," the term for how researchers believe information on the Web can be intelligently labeled, interpreted, and linked through applications that can draw relationships and discover buried information.

Computer scientists have grand visions for how the semantic Web will help users cut to the core information they are seeking. A few years ago, attaching keywords to Web pages was seen as the way to make orderly sense of data, but that is now increasingly viewed as inferior.

However, there is trepidation as to how this next version of the Internet will develop, and whether the new ideas can be translated into applications and interfaces that are easy to use.

"I think there's a chance actually that we can do better this time around," said Tim Berners-Lee, who is credited with inventing the World Wide Web in 1989. He was one of several panelists in a discussion about the semantic Web Wednesday at the W3C (World Wide Web) conference.

"I think it's also possible we mess that up, and the Web 2.0 becomes a big mess of rather unreliable stuff which you end up having to go through with Google," he said.

Google's PageRank feature was seen as a leap forward in search, but computer scientists are striving for far more advanced tools. The semantic Web is not expected to replace search engines; the technology is seen as complementary to other applications that will mine the data attached to Web pages.

Search engines may eventually be able to use pages optimized for semantic Web content, although Berners-Lee jokingly predicted that search companies won't be overly enthusiastic about the concept.

"Search engines make their money by making order of chaos," Berners-Lee said. "If you gave them order, then they wouldn't have a business. So that's why they are not interested in looking at the semantic Web."

Labeling information on the Internet involves tagging it with code and then classifying it into a taxonomy. Customized taxonomies and ontologies, or data models, could be created for different subject matters to connect disparate, rich information tucked away on servers.

It's an approach that differs vastly from current search engine technology, which may be able to find all instances of a keyword and rank a document's popularity but not interpret the context.
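The contrast between keyword matching and labeled data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the documents, labels, and function names are invented for the example, and a real semantic Web application would more likely rely on standards such as RDF and a dedicated query language than on plain Python lists.

```python
# Hypothetical data: two short documents that both contain the word "Exxon".
documents = {
    "doc1": "Exxon Mobil reported quarterly earnings on Thursday.",
    "doc2": "A mobile phone was left at an Exxon station.",
}

# Hypothetical labels, written as (subject, predicate, object) triples.
# Together they form a tiny taxonomy attached to the documents.
triples = [
    ("doc1", "mentions", "ExxonMobil"),
    ("ExxonMobil", "is_a", "OilCompany"),
    ("OilCompany", "subclass_of", "Company"),
    ("doc2", "mentions", "MobilePhone"),
    ("MobilePhone", "is_a", "Device"),
]


def keyword_search(term):
    """Return every document whose text contains the term, ignoring context."""
    term = term.lower()
    return sorted(doc for doc, text in documents.items() if term in text.lower())


def types_of(entity):
    """Follow is_a and subclass_of links to collect every category of an entity."""
    found = set()
    frontier = [entity]
    while frontier:
        node = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == node and predicate in ("is_a", "subclass_of") and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def semantic_search(category):
    """Return documents that mention an entity belonging to the given category."""
    return sorted(
        subject
        for subject, predicate, obj in triples
        if predicate == "mentions" and category in types_of(obj)
    )


print(keyword_search("exxon"))     # both documents match the keyword
print(semantic_search("Company"))  # only the document about the company matches
```

In this sketch the keyword search returns both documents, while the labeled query returns only the one that is actually about a company, because the taxonomy records that the oil company is a kind of company.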

"Google is great, but I don't want to go look up Exxon Mobil on Google and get six million hits," said Clare Hart, executive vice president at Dow Jones and chairman of Factiva, the company's subscription news aggregation service. "It doesn't help me if the meaningful hit is 20 links down."

The semantic Web concept can also be applied to data held within enterprises. But businesses are concerned with how to label their data and, ultimately, with their return on investment in semantic technologies.

In the long term, businesses need to realize that developing ontologies for data is an asset, said Richard Benjamins, director of innovation and research and development at Intelligent Software Components in Madrid.

Before semantic technology can progress, businesses will have to be convinced that ontologies are manageable and affordable. "That perception does not exist yet," he said.

But the cost of hunting for data is high, since it's not an efficient use of people's time, Hart said.

"What's unfortunate is the amount of time people are spending searching is increasing," she said. "The semantic Web is going to enable that to decrease, and that's I think the return on investment companies have to strive for."

©1994-2009 Infoworld, Inc.