August 07, 2006

Update: AOL reportedly released search data

AOL's apparent release of details of users' Internet searches raises privacy concerns

AOL has apparently released details of Internet searches performed over a period of three months by hundreds of thousands of its subscribers, raising privacy concerns.

The data, apparently made available for research purposes, is no longer available at the Web site http://research.aol.com, but details of the data were cited by technology blog site Techcrunch, and the page linking to it was cached by Google's search engine.

The cached copy of the page said the data comprised about 19 million Web searches performed by 658,000 users from March through May. The page warned of sexually explicit language in some of the queries, and said of the data, "This collection is distributed for noncommercial research use only." The page contained a link to a compressed copy of the data archive.

The page asked researchers using the data to cite a research paper entitled "A Picture of Search" based on the data, which names two AOL employees as co-authors. That paper is still available for download.

AOL officials in London are aware of the issue, they said Monday morning. They had no further comment, and referred queries to the company's U.S. headquarters. Reached in the U.S., company officials did not have an immediate comment.

The release of such information poses serious privacy concerns. Major search engine companies fought a request for similar data on user searches last year by the U.S. Department of Justice.

The U.S. government wanted to use the data to check the effectiveness of a federal law aimed at limiting minors' access to harmful material. In January it filed a motion with the court to compel Google to comply with its subpoena and turn over a "random sample" of 1 million Web site addresses found in its search engine index.

It also asked the company for the text of all queries filed on the search engine during a specific week. America Online, Yahoo, and Microsoft's MSN were also subpoenaed, and complied to varying degrees.

The alleged release of AOL's data has sparked concern over how it might be used after its widespread release. While the original page is gone, the data has since been made available on several other Web sites.

The data is valuable from a market research perspective, said David Bradshaw, principal analyst at Ovum. Normally, similar kinds of data sets are only released to trusted researchers, not the general public, he said.

Even then, the resulting research is released as a batch of aggregated statistics, masking signs of individual users' behavior, he said.

"I do think this was foolhardy at best and a complete disaster or worse for AOL," Bradshaw said. "If I were an AOL user, I'd be up in arms."

The researchers who used the data wrote in an introduction that user IDs were replaced with an anonymous number. However, observers are expressing concern about whether users could be tracked based on their queries.

The data also contains the time when a particular query was executed. If a user clicked on a result, the rank of the item was recorded, along with the domain portion of the URL (uniform resource locator).
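To make the tracking concern concrete, here is a minimal Python sketch of what one record might look like, based only on the fields described above. The field names, the `SearchRecord` class, the `group_by_user` helper, and the sample rows are all hypothetical and invented for illustration; they are not taken from the released file.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchRecord:
    # All field names are assumptions chosen to mirror the article's description.
    anon_id: int                        # number substituted for the real user ID
    query: str                          # text the user typed
    query_time: str                     # when the query was executed
    click_rank: Optional[int] = None    # rank of the clicked result, if any
    click_domain: Optional[str] = None  # domain portion of the clicked URL


def group_by_user(records):
    """Collect every query issued under the same anonymous number."""
    history = defaultdict(list)
    for rec in records:
        history[rec.anon_id].append(rec.query)
    return dict(history)


# Invented sample rows, for illustration only.
records = [
    SearchRecord(1001, "plumbers in springfield", "2006-03-02 09:15:00"),
    SearchRecord(1001, "knee pain remedies", "2006-03-02 09:40:00", 1, "example.com"),
    SearchRecord(2002, "cheap flights", "2006-04-11 18:05:00"),
]

histories = group_by_user(records)
print(histories[1001])
```

Because the same number stays attached to every query a subscriber made, grouping on it yields a per-user search history. That is why observers worry that the queries themselves could point to an individual even with the real user ID removed.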

©1994-2009 Infoworld, Inc.