July 27, 2007

Update: Wikia search engine gains a Web crawler

The Wikia project to create an open source search engine akin to Wikipedia has acquired the Grub Web crawler and released it under an open source license.

Wikia's project to develop an open source search engine got another boost with its acquisition of the Grub distributed Web crawler, the company announced Friday.

Wikia acquired Grub from LookSmart and released it under an open source license, adding a significant component to Search Wikia, scheduled to debut in this year's fourth quarter.

The Search Wikia project seeks to create a search engine based on open source search protocols and human collaboration, drawing from the concept of the Wikipedia online encyclopedia, which is written and edited by a community of volunteer collaborators.

As a result, it will provide a better search experience than those offered by commercial, proprietary search engines such as those from Google, Yahoo, and Microsoft, said Jimmy Wales, Wikia's co-founder and chairman and the founder of Wikipedia.

However, people shouldn't expect Search Wikia's first version to deliver on this lofty goal. "When I opened Wikipedia with only three articles in it, it was a pretty bad encyclopedia," Wales said. "That's where we're going to be in December with the search engine: We'll tell people upfront: 'It's not very good yet.' But we'll open it up to get feedback and community involvement to help us make it better."

Wikia, which has about 35 full-time employees, is developing its own relevancy technology to rank search results, he said. At this point, it's not clear whether at launch Search Wikia will offer only general Web searching or if it will have specialty engines for things such as images and news.

Visually, Search Wikia will have the standard search-engine interface. "The differences will be that at various points all along that search process, there'll be opportunities for people to engage with the community and join and participate in the construction of the search results," Wales said.

Since Search Wikia's announcement in December, an increasing number of organizations, such as search and online advertising player LookSmart, have expressed interest in the project, Wales said. In the coming months, Wikia will provide more details about third-party partnerships and support.

"A lot of second-tier [search] players understand that competing with Google directly as independent proprietary projects, they'll never catch up because they don't have enough individual resources. But by banding together using open source software, they can effectively compete with Google and improve their services that way," Wales said.

In fact, Wales believes that supporting Search Wikia would be in the best interest not only of second-tier search engines but also of search leaders, including Google. Search isn't a "defensible business" because it's very easy for people to switch among providers, so just as Google unseated AltaVista, another large competitor could emerge, he said. Thus, democratizing search in the way Search Wikia will attempt to do is a good thing for Google.

©1994-2009 Infoworld, Inc.