InfoWorld

Could Google's 'dataspaces' reshape search?

'Dataspaces' concept, which stems from the work of Google researcher Alon Halevy, could take search technology and content processing to another level, analyst claims


Google -- the company most identified with Web search -- is not the leading player behind the firewall. It claims about 9,000 customers for its enterprise search products, while independent search vendor Autonomy says it has 17,000.

Still, in "Beyond Search," his recent report for Gilbane Group, analyst Stephen Arnold portrays the company as a quietly humming engine of activity, with work under way that could "leapfrog" the current generation of search technology.

[ For more, see related story: "The future of enterprise search." ]

Arnold, who closely tracks Google's patent applications, is especially interested in a concept called "dataspaces," which stems from the work of Google researcher Alon Halevy. Dataspaces, in Arnold's view, take "content processing into a new dimension."

"A dataspace should contain all of the information relevant to a particular organization regardless of its format and location, and model a rich collection of relationships between data repositories," Halevy wrote along with two co-authors in a December 2005 paper. "Hence, we model a dataspace as a set of participants and relationships."

"The participants in a dataspace are the individual data sources: they can be relational databases, XML repositories, text databases, Web services and software packages," the paper states at another point. "A dataspace should be able to model any kind of relationship between two (or more) participants."

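The paper's description of a dataspace as "a set of participants and relationships" can be illustrated with a minimal sketch. This is not Google's or Halevy's implementation; the class names (`Participant`, `Relationship`, `Dataspace`) and methods are hypothetical, chosen only to show that a relationship may link two or more data sources of any kind.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Participant:
    """One data source in the dataspace (hypothetical model)."""
    name: str
    kind: str  # e.g. "relational database", "XML repository", "Web service"


@dataclass(frozen=True)
class Relationship:
    """A labeled link between two or more participants."""
    label: str
    participants: tuple

    def __post_init__(self):
        # The paper speaks of relationships between "two (or more)" participants.
        if len(self.participants) < 2:
            raise ValueError("a relationship needs at least two participants")


@dataclass
class Dataspace:
    """A set of participants plus the relationships among them."""
    participants: set = field(default_factory=set)
    relationships: list = field(default_factory=list)

    def add_participant(self, participant):
        self.participants.add(participant)
        return participant

    def relate(self, label, *members):
        # Only sources already registered in the dataspace can be related.
        missing = [m.name for m in members if m not in self.participants]
        if missing:
            raise KeyError(f"unknown participants: {missing}")
        relationship = Relationship(label, tuple(members))
        self.relationships.append(relationship)
        return relationship

    def related_to(self, participant):
        """Return every relationship that involves the given participant."""
        return [r for r in self.relationships if participant in r.participants]
```

In this sketch, a customer database and a document repository would each be added as a `Participant`, and `relate("mentions", crm, wiki)` would record that the two sources are connected without requiring them to share a schema.
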
While other vendors are pursuing similar goals, they cannot compete on scale with Google, according to Arnold.

"Even the most robust content processing systems have not been engineered to handle Google-level content flows. The implication of scale means Google is operating largely without competition from the companies profiled in this study," he wrote in "Beyond Search."

Meanwhile, Google indeed appears to have ambitious search and content-processing projects in the patent pipeline that echo the dataspaces concept.

One in particular, U.S. Patent Application No. 20070198481, "Automatic Object Reference Identification and Linking in a Browseable Fact Repository," describes an invention that crunches together a wide range of data on an individual or topic into a kind of dossier.

Google declined to comment on patent applications or make Halevy available for an interview.

"We file patent applications on a variety of ideas that our employees come up with," a company spokesman said via e-mail. "Some of those ideas later mature into real products or services, some don't."

But a company executive was willing to describe the company's approach to search in general terms.

"Inside an enterprise, and maybe unlike the Internet, you can know a lot about a user," such as who they report to, said Matthew Glotzbach, director of product management for Google's enterprise division. "There's a lot of empirical information you can derive. All of that can be used to create a very, very rich profile about the user, which can then be used to create a really rich search experience."

Do not expect Google to suddenly bring a game-changing product to market, according to Glotzbach.

"The model is not these kind of big-bang approaches where we work for multiple years and then roll something out. In terms of what we do in enterprise search, you'll see a constant flow, as opposed to one sort of big bang -- here's a whole new thing," he said.


Copyright © 2008. All rights reserved.
