This is extraordinarily valuable in theory, true, but worthless if we're unable to extract the important nuggets. To mine this gold, the Obama Administration announced its Big Data Research and Development Initiative in March. Five agencies made about $200 million in new commitments toward improving big data tools and techniques: the aforementioned NIH and USGS plus the National Science Foundation, the Department of Defense (DOD), and the Department of Energy (DOE). The data challenges these agencies and departments face range from better use of the DOE's supercomputers for crunching scientific data to facilitating "rapidly customizable visual reasoning" for diverse DOD missions.
These are valuable nuggets, to be sure, and, in the grand scheme of things, $200 million is a bargain. But the administration's investments in Big Data don't stop there. In August the White House announced its Presidential Innovation Fellows program, which brings a crack team of innovators together to collaborate on projects with the goal to "improve the lives of the American people, save taxpayer money and fuel job creation." On the initial list of target projects are Blue Button for America, an extension of the Department of Veterans Affairs' Blue Button initiative, as well as an open-ended set of projects the White House calls Open Data Initiatives.
The Open Data Initiatives have a different mandate than the Big Data Initiative, but the synergy between them is obvious. Open Data focuses on "liberating" government data (as well as contributed corporate data) in order to achieve the strategic goals of the Innovation Fellows program.
What does it mean to liberate data? The two examples cited are NOAA weather data (now at the core of every weather report on television) and the Global Positioning System, without which we'd all literally be lost.
Of these examples, NOAA weather data most obviously present big data challenges. The value of such large data sets lies not simply in the weather data themselves, but in the ability to forecast weather based upon those data, a classic big data problem. As American citizens, we value accurate forecasts; the immense quantity of historical weather data that feed the forecasting engines is merely the ore we must mine to find the nuggets we desire.
Such is the challenge facing the Open Data Initiative. The more data we have, the less we value the data sets themselves. The information we truly desire lies buried under increasing quantities of irrelevant or otherwise useless information. The danger is that the more data the government provides us, the better hidden are the nuggets we desire. In other words, in the absence of effective big data solutions, truly open government may be out of reach or, worse, misapplied to obscure the very information that citizens would find most valuable.