Will open data survive Trump?

The US federal government collects vast quantities of data on hundreds of topics and makes it publicly available. Those efforts now face an existential threat.

Nothing in the world is more dangerous than sincere ignorance and conscientious stupidity. -- Martin Luther King Jr.

Imagine driving a car without a dashboard. Not only would you break the speed limit and soon run out of gas, but in a new car, you’d lose your engine health readouts, rear-cam view, and other telemetry.

I’ve always seen open data as the telemetry we need to drive democracy. Sure, open U.S. federal government data tends to be widely dispersed, difficult for ordinary people to digest, and often ignored by the news media. But for years vast quantities of open data about everything from agriculture to mining to education to energy have been available for anyone to peruse or download from government websites.

What sort of data are we talking about? For starters, check out the Open Data 500, an NYU project funded by the John S. and James L. Knight Foundation, which lists hundreds of federal government data sources -- and 500 private companies that depend on them. 

Open data has always faced challenges: institutional inertia, attempts to conceal incompetence, and so on. The Obama administration gave open data a major boost with Data.gov (a catalog of open federal government data) and other initiatives, including The Opportunity Project (a project to jumpstart open data apps) announced last March. But now, as we enter the Trump era, open data may face its ultimate test.

Last week I spoke with Alex Howard, deputy director of the Sunlight Foundation, a nonpartisan transparency organization. He was reluctant to make predictions about what the Trump administration would or would not do. But citing Trump’s refusal to disclose his tax returns and full medical records, Howard stated the obvious:

It is not hyperbole to say that Mr. Trump was the least transparent candidate in modern history … there’s nothing that is particularly positive that would lead one to believe that he’s going to adhere to the traditional democratic norms around transparency and accountability unless they’re mandated by law.

This is one reason why, in an effort to preserve records of climate change, “scientists have begun a feverish attempt to copy reams of government data onto independent servers in hopes of safeguarding it from any political interference,” as the Washington Post reported. A guerrilla archiving event hosted last month by the University of Toronto focused on preserving climate change information by copying "the federal online pages and data that are in danger of disappearing during the Trump administration."

Kin Lane, a self-styled API evangelist and ex-Presidential Innovation Fellow, has done a ton of pro bono API work on government data. He told me that many different groups are currently engaged in backup efforts. Lane himself has already backed up the Data.gov index to GitHub.

In an excellent post on Nate Silver’s FiveThirtyEight site last month, Clare Malone details the many ways open data could be undermined, particularly where data is made public due to custom rather than regulatory requirement.

For example, the oft-cited Current Employment Statistics Survey from the BLS (Bureau of Labor Statistics) is voluntary under federal law. As Howard puts it, “the major concern, I think, is that if you have a president-elect who says the real unemployment rate is 40 percent or whatever he says it is and the BLS says heck no, it’s actually 4.7 percent and underemployment is 13.7 percent and labor force participation rate is 62.7 percent, what will happen then?” 

Kin Lane has a particularly pessimistic view of what's ahead. He believes "most projects will go dormant -- some will go away entirely. I think open data will gain a bad reputation and be seen alongside regulation." 

Howard offers a couple of signs to watch. One would be starving agencies of funding, which would reduce sample sizes and otherwise undermine data quality. This in turn would make it easier to cast doubt on the data itself. “If the data clearly shows that hydraulic fracturing has an impact upon earthquakes, which the USGS has demonstrated in Oklahoma, if that’s politically inconvenient to business interests, what do you do about that? Do you fund it? Do you reduce the periodicity of data publication?”

The incredible quantity of data collected across the federal government is a national treasure. Few other countries on earth apply the same energy, funding, and rigor to assembling such extensive stores. Even if ordinary citizens don't go to Data.gov for entertainment, both policymakers and business leaders need objective data to make sound decisions.

Before joining the Sunlight Foundation, Howard worked at O’Reilly Media, starting there a few years after Tim O’Reilly convened a group of open government advocates to develop the eight principles of open government data in 2007. Howard says the idea of open data really goes back to the Constitution, which stipulates an "Enumeration" (aka, census) be held to apportion Congressional seats -- an indication that "open data is in the DNA of the USA." Even further, open data harkens back to the original Enlightenment idea that reason based on fact should govern human action.

We'll see how that quaint notion survives the post-fact era. Meanwhile, consider contributing to the Sunlight Foundation and the Electronic Frontier Foundation.