Tapping the richness of weather data
Although best known for the friendly forecasts on The Weather Channel and Weather.com, The Weather Company has spent the past two years branching out, refashioning itself as a provider of powerful big data analytics. Today, the Atlanta-based company's WeatherFX data service ingests more than 20TB of data per day, including satellite pictures, radar imagery and more, from more than 800 public and private sources. By crunching those terabytes of information into insights that affect the bottom line, WeatherFX is helping insurance companies, media conglomerates and airlines save money, drive revenue and satisfy customers.
For example, by mashing up hail data with policyholder addresses, insurers can alert homeowners to potential damage to their homes and cars. "By warning customers of pending dangers, insurers can encourage customers to protect their personal property, which lessens the impact of claims on insurers caused by bad weather," says Bryson Koehler, CIO at The Weather Company.
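To make the idea concrete, here's a minimal sketch of such a mashup. It flags policyholders whose geocoded addresses sit within a few kilometers of a reported hail event; the sample records, field names and 5km alert radius are illustrative assumptions, not details of WeatherFX or any insurer's actual pipeline.

```python
# Hypothetical sketch: flag policyholders near reported hail events.
# Records, field names and the 5 km radius are illustrative assumptions.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

hail_reports = [  # e.g. parsed from a storm-report feed
    {"lat": 33.75, "lon": -84.39, "max_hail_in": 1.75},
]
policyholders = [  # geocoded policy records
    {"policy_id": "P-1001", "lat": 33.77, "lon": -84.37},
    {"policy_id": "P-2002", "lat": 34.90, "lon": -85.10},
]

ALERT_RADIUS_KM = 5.0
for p in policyholders:
    for h in hail_reports:
        if haversine_km(p["lat"], p["lon"], h["lat"], h["lon"]) <= ALERT_RADIUS_KM:
            print(f"Alert {p['policy_id']}: hail up to {h['max_hail_in']} in. reported nearby")
```

In practice an insurer would run this kind of spatial join against storm polygons in a geospatial database rather than a nested loop, but the principle, weather events joined to customer locations, is the same.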
Airlines also use weather data. They may, for instance, monitor storm patterns and reposition aircraft to avert scheduling delays. And retailers are discovering that keeping track of the weather can help them anticipate consumer demand and thereby boost sales -- they might, for example, stock their shelves with anti-frizz hair products when a heat wave is expected.
Still, packaging data from 800 sources, much of it open, requires heavy lifting on the part of The Weather Company's IT department. Koehler says the company had to assemble "an incredibly complex environment" to manage "a dog's breakfast" of documents. Nearly two years ago, The Weather Company rebuilt its entire platform, consolidating it into SUN (Storage Utility Network), which is deployed on Riak NoSQL databases from Basho Technologies and runs across four availability zones in the Amazon Web Services cloud. Today, SUN gathers 2.25 billion weather data points 15 times per hour.
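For readers curious what storing observations in a key-value store like Riak looks like, here is a rough sketch using Basho's (now-archived) Python client. The bucket name, key scheme and observation fields are assumptions for illustration only, not The Weather Company's actual SUN schema.

```python
# Illustrative sketch: storing and fetching one weather observation in Riak
# with Basho's Python client (pip install riak). Bucket name, key format and
# fields are assumptions, not the actual SUN data model.
import riak

client = riak.RiakClient(protocol="pbc", host="127.0.0.1", pb_port=8087)
bucket = client.bucket("observations")

# Key the observation by station and timestamp so reads are simple lookups.
key = "KATL:2014-06-01T15:00Z"
obs = bucket.new(key, data={
    "station": "KATL",
    "temp_f": 88.0,
    "dew_point_f": 72.0,
    "source": "metar",
})
obs.store()

fetched = bucket.get(key)
print(fetched.data["dew_point_f"])  # -> 72.0
```

Keying records this way keeps each read a single lookup, which is one reason key-value stores suit high-volume sensor feeds replicated across multiple availability zones.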
Overseeing this new IT platform is a data science team composed of 220 meteorologists and hundreds of engineers, each with in-depth domain knowledge of atmospheric phenomena. "When you're ingesting data from 800 different sources, you need to have some level of expertise tied to each one," says Koehler. "Most Java developers aren't going to be able to tell you, in intricate detail, the difference between a 72 and a 42 on a dew-point scale and how that may or may not impact a business."
Yet even among the IT leaders spearheading today's open data revolution, many argue that it's time for the U.S. government to play a greater role in the collection, cleaning and sharing of data. In fact, open data services provider Socrata reports that 67.9 percent of the everyday citizens surveyed for its 2010 Open Government Data Benchmark Study said they believe government data is the property of taxpayers and should be free to all citizens. Such sentiment has already prompted the U.S. government to launch new services through its Data.gov website, enabling visitors to easily access statistical information. But that hasn't stopped techies from drawing up a laundry list of open data demands for government officials.
"A standard structure, a standard set of identifiers, greater data cleanliness, releasing data in a database-friendly format, making it machine-readable, making sure we can use the data without restrictions -- these are all ways that government can improve the data they're supplying," says Ryan Alfred, president of BrightScope, a provider of financial information and investment research.