What open data really means for government

The open data movement purports to cultivate an informed citizenry and rescue government offices from the dark ages -- but can it tell me which streets to avoid on my bike?

Credit: Amanda Walker

I recently attended the Triangle Open Data Day (TODD) conference at North Carolina State’s campus in Raleigh, an event sponsored by Code for America. On the surface, the open data movement is about getting local, state, and federal government to publish data in a way that citizens can use. This means publishing reports, spreadsheets, and other documentation online -- and it scales to mean publishing data sets with APIs, so applications all over the place can build upon and extend those data sets.

This is merely the surface. Just as HealthCare.gov’s well-publicized launch failure wasn’t really about a website crashing, but about back-end systems' inability to hold up, open data is really about how the back end of your state and local government works.

As a business owner, I do not deal in a lot of paper. I occasionally have contracts to sign, but increasingly that's done electronically. That said, each and every interaction with paper I’ve had over the past several years has been government based.

I’d venture to guess that if you walk into most offices these days, you'll see few if any file cabinets -- and not many of those contain paper files. Walk into your state or local government, however, and it's often like a trip to the '60s or '70s without Joan, the booze, or colorful fashions. To achieve “open data,” governments need to undergo cultural and technology transformations. Much of this is already under way, but open data initiatives serve as accelerators.

What kind of data are we talking about?

The TODD event consisted of presentations mirroring the communities that were presenting. One very dynamic politician -- Lori Bush from Cary, N.C. -- came with her technical entourage, which showed how online construction permits can be color coded, plotted on a map, and adjusted with time. (Cary is known for its pastel housing developments and homeowners associations that measure your grass height, so it has the money for open data initiatives. It even declared an Open Data Day.)

My community, Durham, is known for diversity, great restaurants, a burgeoning art and music scene, and Duke University -- as well as poverty and homelessness. It has a reputation for violent crime, though statistically it's average for American cities of its size.

Durham has recently started its open data initiative. The CIO of Durham City mentioned that as important as open data is, if you walk two or three blocks from City Hall, you enter a neighborhood of people to whom such data would not seem relevant. One of the data initiatives it is undertaking tracks homelessness, although the city is making sure to do this in a way that avoids misuse or invasion of individual rights.

Ultimately, open data events like TODD go beyond what municipalities think is important. While my state’s senator has voiced skepticism toward the government’s role in making sure we do not die of hepatitis by supporting restaurant workers' rights to wash their hands with freedom instead of soap, one industrious fellow, Jamie Dixon, showed the interesting things you can learn about Wake County restaurants through its health inspections.

Another group did something near and dear to my heart. It sampled Tupac Shakur, plotted the state bike crash data on a map, and called it Ride or Die.

Publishing open data

All of the open data platforms can be described as a simple CMS plus code to help turn CSV or similar files into JSON. Some also provide common glue code to autoplot GIS data against Google Maps. Most also have something that tastes a lot like Google Analytics to see how the site is used.
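To make the "CSV into JSON" part concrete, here is a minimal sketch of that glue code in Python, using only the standard library. It is not how any of these platforms actually implements it. The permit columns, the sample rows, and the function names (csv_to_records, records_to_geojson) are all made up for illustration, and I'm assuming the coordinates arrive in columns named latitude and longitude.

```python
import csv
import io
import json


def csv_to_records(csv_text):
    """Parse CSV text into a list of dicts, one per row, keyed by header."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def records_to_geojson(records, lat_field="latitude", lon_field="longitude"):
    """Wrap the records that carry coordinates in a GeoJSON FeatureCollection.

    The field names are assumptions; real data sets name these columns
    in all sorts of ways.
    """
    features = []
    for row in records:
        try:
            lat = float(row[lat_field])
            lon = float(row[lon_field])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows without usable coordinates
        properties = {
            key: value
            for key, value in row.items()
            if key not in (lat_field, lon_field)
        }
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical sample data, not taken from any real permit data set.
SAMPLE_CSV = """permit_id,type,latitude,longitude
P-001,residential,35.7915,-78.7811
P-002,commercial,,
"""

records = csv_to_records(SAMPLE_CSV)
geojson = records_to_geojson(records)

print(json.dumps(records, indent=2))
print(json.dumps(geojson, indent=2))
```

The second row has no coordinates, so it shows up in the plain JSON records but is left out of the map layer. GeoJSON is only one reasonable choice for the mapping side; most web mapping libraries can read it.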

I was surprised that my municipality decided to use a proprietary SaaS product as the first step before deciding what it wanted to publish. OpenDataSoft seems to be making the rounds. I spoke to a fellow filming the Durham part of the presentation, and he said they found it was simply the easiest to use.

My first reaction to reading about OpenDataSoft: This seems like Drupal plus stuff to turn data into JSON. After a little Googling, I found that there were indeed Open Data plug-ins for Drupal from a project called DKAN.

DKAN seemed like an unlikely name -- turns out it's a knockoff of one of the most popular platforms, CKAN. CKAN looks about as easy as OpenDataSoft, but is actually open source and, like OpenDataSoft, still offers cloud-based hosted solutions.

One of the fellows at the conference was from Socrata, another platform that seemed to have a more comprehensive feature set than any of the above and is at least partly open source.

Assessing the supposed social effects

The idea of open data is that an informed populace will interact with its government more efficiently and effectively. Moreover, an increasing number of people looking at the data might find better or different solutions. Finally, a government not trapped in the era of midcentury modern architecture and Don Draper might run better or with greater efficiency.

There is some anecdotal evidence that this is happening. For instance, the Wake restaurant data revealed that Chinese restaurants were graded more harshly than others, and rating variations among inspectors came to light. On the other hand, a different study casts doubt on all of this. The truth is that it is probably too early to tell.

For myself, I want to get my “am I riding into a deathtrap?” bike app together. I envision cycling along with my bone-conduction headset, Strava, and hopefully a heart monitor that works, complete with voice warnings backed by government and crowdsourced data noting the likelihood that some jerk in a large pickup truck will mow me down.

Will open data change government? Maybe or maybe not, but it sure might change me. Ride or die, my friends -- ride or die!