Big data and elections: The candidates know you better than you know them

Most political campaigns emphasize providing information about a candidate to voters. But in the era of big data, they are also collecting information about voters – with little or no control, consent, or security.

With the presidential nominating conventions looming, the candidates are getting ready to add to the hundreds of millions they've already spent to tell you about themselves -- but only what they want you to know about themselves.

Meanwhile, they have also been spending millions of dollars collecting information about you -- and you have no say in what is collected.

Which means that, in the era of big data, if you're a potential voter, they know a lot more about you than you know about them.

The desire to know what will turn a voter to, or from, a candidate is not new, of course. Campaigns have been chopping up voters into interest groups for decades -- minorities, gays, blue-collar workers, soccer moms, the religious right, progressives, boomers, NASCAR dads, union members, retirees, the rich, plus a host of occupational groups ranging from health care to law to the food and beverage industry.

They have been tracking voting history, political contributions and volunteer history as well.

But the information being collected now is much more, as they say, "granular." It includes social media -- everything from "friends" and "likes" on Facebook to YouTube views, LinkedIn profiles, activity on Pinterest, Tumblr, Instagram and Reddit to who a person follows on Twitter, or who they retweet.

It includes magazine subscriptions, the types of cars or boats voters own, where they shop, their charitable contribution history, memberships, where they live, whether they rent or own a dwelling, whether they have a vacation home, what permits and licenses they hold, whether they own a gun, and more.

All of which is designed to help candidates "micro-target" their message to groups of voters. They call it better communication, although it has an obvious element of manipulation to it.

"It can be as simple as swapping out a phrase that might have been found to be more appealing to one kind of voter, via focus groups, etc., or more complicated things like changing the visual demographics or traits of people appearing in ads," said Joseph Lorenzo Hall, chief technologist at the Center for Democracy & Technology.

Josef (Joey) Ansorge, New York attorney and author of "Identify & Sort," which includes a focus on the political implications of big data, said the ZIP code is among the most important pieces of information collected because, "where they live, where they work and where they went to school tell us a lot about individuals."

When that information is correlated with information gathered from contacts, he said, "calls or visits inform the campaign how an individual is tending to vote."

This, he said, lets campaigns create "micro" groups of voters, the most important of which are those considered "sway-able." Obviously, that is the group the campaigns will try the hardest to influence.

But such detail about people's lives, preferences and opinions -- even their personal health -- also raises both privacy and security concerns. How many people have access to it? How well is it being protected from online attacks? Will it be discarded after the election is over, or kept indefinitely? Could it be used by those who get elected and want to track those who supported their opponent?

Ansorge has a problem with using big data to send very different messages to different groups. "There is an elemental universalism to democracy that is undermined by these kind of practices," he said, adding that he thinks voters ought to be made aware of how campaigns feed them information based on their profiles.

Andrew Hay, CISO of DataGravity, said he is not overly concerned about the collection of voter data itself, or even the tweaking of the message. "Candidates have a lot of information to remember, and the analysis of data simply helps them match the needs and wants of clusters of voters to a particular message," he said.

But he said data security and governance are crucial. "I'm less concerned about the government keeping a 'burn list' of clusters of voters and more concerned with the protection, retention, and destruction of the data collected," he said. "This includes raw data as well as any derived analysis from said data."

That is also the view of Brenda Leong, senior counsel and director of operations at the Future of Privacy Forum. Big data analytics offers "great new ways to engage with voters on the things that really matter to them, which results in more motivated, and hopefully better informed, participants in the electoral process, and likely higher turnouts on election day," she said.

But she said "proper handling of the data" is not always easy for campaigns that tend to ramp up quickly from nothing to, "multi-million-dollar -- even billion-dollar -- enterprises, made up with large sections of volunteers or temporary staff. 

"Every campaign needs to treat security and privacy needs seriously, and have meaningful training for workers. We strongly recommend that every campaign have a chief privacy officer to monitor just these issues," she said.

Ansorge agrees. "These databases have afterlives that are not under the control of the government or the party," he said. "There is always a risk of abuse, by domestic and foreign actors. Here there is a perfect storm of data collected for a specific purpose potentially being abused for another."

Unfortunately, there is ample evidence that it is more than just potential. Just three weeks ago, MacKeeper security researcher Chris Vickery discovered that a client of the data brokerage firm L2 was hosting a database with 154 million U.S. voter registration records and "leaking information on a dizzying array of intimate details, including gun ownership, Facebook profiles, address, age, position on gay marriage, ethnicity, email addresses and whether a voter is 'pro-life.'"

That wasn't the only case. Six months earlier, Vickery discovered a "misconfigured" voter database with 191 million voter records -- including his -- that was, "just sitting in the public, waiting to be discovered by anyone who happens to be looking," according to CSO's "Salted Hash" columnist Steve Ragan.

Vickery told Ragan he was outraged to see his own record with, "details that could lead anyone straight to me. How could anyone with 191 million such records be so careless?"

Yet another breach, of 56 million records, included 19 million profiles that had not only voting history but also personal information like "Christian values, Bible study, and gun ownership."

Hall said those cases, along with nation-state hacking of campaign information systems, make it obvious that voters should be concerned about the data collection of modern political campaigning.

"Campaigns only seem to care about the security of data when they're protecting it from their political rivals," he said. "Voters should be especially concerned because there are zero repercussions for campaigns mistreating or improperly protecting these data. The FTC has no jurisdiction over non-profits -- there are serious First Amendment problems with government telling political speakers (campaigns) what to do.

"And there is zero chance that politicians will pass laws that reduce their capacity to micro-target, even if it means more robust protection of voter data."

Beyond that, political databases are more likely to be hacked because they are shared more than those collected by commercial companies. Leong noted that, "companies routinely promise not to share your data, but campaigns and political advocacy organizations share data as a standard, so reading the disclosures or policies when submitting data is more important than ever.

"If you sign up for a particular cause or issue, that organization is likely telling you that they intend to share that information with 'like-minded' organizations, and you will end up on the mailing list for multiple causes," she said.

Hall agreed. "If you donate to a campaign, one of the first things you see -- and will see periodically after that -- is a 'We'd like to get to know you better!' survey," he said, adding that they will seek information on things like gun ownership and views on abortion, "that the campaign can't easily infer or purchase from other sources."

He said even when voters volunteer that information, he is not sure they understand that it is used to get "highly granular information about the voters for targeting, and in a number of cases this year, to get information about households around a given voter's address that might not be as forthcoming or politically involved, such as, 'Do you know if any of your neighbors are gun owners too?'"

Ansorge said he thinks it would not be too difficult to create laws to limit data collection, especially governing presidential campaigns. "Candidates would self-discipline and would not want to create the potential scandal of their campaign being identified as law-breaking."

He said voters could decide to give more of their personal information to the campaign they support -- "we could think of it as donating your data," he said -- but the choice would be up to them.

Given the detail of the data collected, there is general agreement that there should be regulations on destroying it after a campaign ends.

Hay recommended that the U.S. adopt something like the General Data Protection Regulation (GDPR) in the EU, "specifically the Right to Erasure (right to be forgotten) language.

"If, as a citizen, I give consent to my data being collected and used in this manner I should also have the right to request what has been collected and the right to have it erased," he said.

This story, "Big data and elections: The candidates know you better than you know them," was originally published by CSO.

Copyright © 2016 IDG Communications, Inc.