Big Data mining: Who owns your social network data?

An attractive application of Hadoop and other Big Data technologies is to analyze users' social activities, sometimes without their express knowledge

But who owns the data?
Mozilla's Heilmann views Big Data as any information accumulated on the Web -- any real-time data. But who specifically owns this data? "That's a very loaded conversation," he says.

"I think it's dangerous right now that the speed and beauty of these interfaces [on sites such as Facebook] make people give information away without realizing that they have done it," Heilmann says. For example, people can upload photos of themselves intoxicated and a potential employer can view them for at least some time afterward.

"You have a real problem deleting anything from the Internet," Heilmann stresses. "As soon as you put it up there, it will be cached, it will be copied somewhere else. You should be very mature about what you put online."

GigaOm's Harris says ownership of the data depends on circumstances. "Certainly, the companies generating it own the data," he says.

Although there is publicly owned data on the Web, Facebook and Twitter, for their part, own the data their users generate, Harris notes. And Big Data concepts such as data marketplaces have resulted, for example, in firms analyzing Twitter streams for a month at a time, Harris says. "There's a lot [of data] that's just available out there if you could harness it" and analyze it.

Cloudera's Awadallah says the question of who owns unstructured data is a hard one to answer. Data such as customer purchasing information in Apple's App Store belongs to Apple, he says. And although Google gives users the right to delete data, it still owns the data itself, he adds.

Thus, the Data Portability Project for porting of social network data promotes the notion that users own their own data and social networks should make it easier for users to move it around. The effort has produced an initiative that aims to get sites to disclose what users can do with their data once it has been uploaded, says Saad, who in addition to his Echo job is co-founder of the Data Portability Project.

Still, Saad notes that in some cases users share ownership and custody of their data with the online services they use. "It's kind of like money in a bank. You own the money but you are basically giving it to the bank to safeguard for you and potentially use on your behalf," he says.  

The issue is not just about privacy. One of the tenets of Big Data is to analyze data from multiple sources to identify trends, business opportunities, market shifts, potential customers, customer sentiment, and a lot more. When Big Data tools analyze information available on the Web, do they really have the right to do so without permission of the owner?

"It really depends," Saad argues. "If you're publishing on the public Internet, I think the social contract is such that people expect their data to be polled and crunched and indexed and used." On the other hand, "it's a little different when Facebook, for example [is] expected to be a private network and it continues to push the boundary of what part of your information is made public. That's when it's violating the social contract."

This story, "Big Data mining: Who owns your social network data?," was originally published at InfoWorld.com.

Copyright © 2011 IDG Communications, Inc.
