Experts say that in the future, predictive analysis will advance to the point where it can tease out information about people's lives and preferences using far more, and far more subtle, data points than were used in the Target case. The inductive models that some companies already use are huge, containing up to 10,000 different variables, each assigned a weight based on how well it predicts the outcome in question.
But big data analysis may have a built-in public relations problem because its way of predicting human behavior seems to have little to do with human behavior. Unlike traditional analysis, which seeks to predict future preferences or behaviors based on past ones, the field's inductive analysis concerns itself only with patterns in the numbers.
After Target "targeted" baby ads at women it thought were pregnant, the women and their families criticized the company's tactics. They were creeped out by the ads because Target's inference about them could not be mapped to any piece of data that they had already provided. Even though Target was correct in its inferences, it was simply not intuitive that the purchase of cotton balls and lotion would predict that the buyer was pregnant and would soon be buying diapers.
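To picture how a model built from weighted variables can turn ordinary purchases into an inference, here is a minimal sketch in Python. It assumes a simple logistic score over a handful of purchase signals. The item names, weights, and baseline are invented for illustration and are not Target's actual model or data; a production model of the kind described above might hold thousands of variables with weights learned from historical records.

```python
import math

# Hypothetical weights, invented for illustration only. A positive weight
# means the purchase nudges the score up; a negative one nudges it down.
WEIGHTS = {
    "cotton_balls": 0.8,
    "lotion": 1.1,
    "motor_oil": -0.3,
}

# Assumed baseline: with no signals, the shopper is unlikely to be in the
# predicted group.
BIAS = -2.0


def score(purchases):
    """Return a probability-like score in (0, 1) from weighted signals."""
    # Each distinct item counts once; items without a weight are ignored.
    total = BIAS + sum(WEIGHTS.get(item, 0.0) for item in set(purchases))
    # The logistic function squashes the weighted sum into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-total))


if __name__ == "__main__":
    print(score([]))
    print(score(["cotton_balls", "lotion"]))
```

No single purchase in the sketch says anything explicit about the shopper. The score rises only because several small, individually unremarkable signals add up, which is why the resulting inference can feel unintuitive to the person it describes.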
More than anything else, this new, mathematical method of analysis may force us to look at our privacy and the way we manage our personal data in a whole new light. After all, it's unsettling to know that hundreds of unrelated bits of our data can be pulled together from a hundred different sources (perhaps verified by fingerprinting technology like BlueCava's) and analyzed to reveal numeric patterns in our behavior and preferences.
"Even the smallest, most trivial piece of information might be strung together with other pieces of information in a pattern that is sufficient enough to infer something about you, and that's a challenging world to live in because it upsets our basic intuitions about discretion," Barocas says.
Transparency, inclusion might help everyone
When Target realized its baby-products ads were getting a negative response, it didn't pull the ads; instead, it elected to hide them among unrelated and less-targeted ads when showing them to pregnant women. Rather than asking female customers if they were interested in special offers for baby products, the company chose to infer the answer in secret.
And that lack of transparency may be the single biggest objection to consumer tracking and targeting today. Advertisers are spending millions to combine, transmit, and analyze personal data to help them infer things about consumers that they would not ask directly. Their practices with regard to personal data remain hidden, and they're acceptable only because people don't know about them.
Such tracking and targeting also feel arrogant. Consumers may not mind being marketed to, but they don't want to be treated as if they were faceless numbers to be manipulated by uncaring marketers. Even the term "targeting" betrays a not-so-friendly attitude toward consumers.
Ironically, advertisers might be far more successful if they pulled back the curtain and included consumers in the process. It's well known that the personal data in marketers' and advertisers' databases is far from fully accurate.
Maybe, as several people I talked to for this story pointed out, the best way to collect accurate data about consumers is to just ask them. And if an advertiser is hesitant to ask for a certain piece of personal data, the advertiser shouldn't infer it.
"What our organization is trying to work out is whether or not there's a way to [collect personal data] where the user knows what's happening and companies [get] their data not by stalking [users] but by asking them," says the Personal Data Ecosystem Consortium's Hamlin.