Big data is a powerful lure, promising to turn the massive and ever-increasing volumes of data inside an organization into a pool of intelligence that yields deep, actionable insight into every aspect of a business. However, that lure can lead you into an expensive trap if you don't plan carefully.
"Big data has big spending risks," says Jeff Muscarella, IT spend management consultant with NPI Financial. Muscarella warns that big data projects can easily ring up seven-figure price tags after you finish paying for the hardware, software, and services, and sometimes the glowing business cases presented by vendors lose their luster when you look closely. "A lot of times, when you pull them apart, they're not as rosy as they seem," he says.
That's not to say that harnessing the power of big data is a mistake, Muscarella explains. But it does mean that organizations seeking to base their decisions on data need to start by gathering real data on how a big data project will benefit the business.
"This is not just new technology," Muscarella says. "It's new technology solving a business problem that we often haven't proved. That's important for CIOs to keep in mind. The business is going to be coming to them with all sorts of half-baked ideas for what they can do with big data. They have to ask: Will it really drive revenue? How and for how long? What will it take to build it? They need to make sure they have a crisp focus on the mission; that it is going to have a return on investment."
For big data, fire bullets, not cannons
When you're exploring a big data project, don't dive in head first, Muscarella warns. Start with open source tools like Apache Hadoop and build a test case.
"You want to really pilot these things," Muscarella says. "Pick something that's manageable. Start on a small scale to prove your hypothesis. For instance, if we could mine this sensor data or these Web clicks or these purchasing habits, would what we do with these results improve our business."
"Don't get trapped into building the infrastructure yet," he adds. "Prove it first and then go back and architect your solution. Assume that however you solve the problem, you're probably going to throw it away and start over. That's OK because at least you proved the business need before you spent a lot of money."
Once you've proven the business need, it's time to look at the infrastructure required to manage big data. Big data projects scale to petabytes and potentially exabytes of data, so getting your storage infrastructure right is essential. Muscarella says that despite vendors' arguments in favor of standardizing on one storage provider, it's better to use storage virtualization technology to introduce competition in low-risk, nonstrategic areas of your architecture.
"You don't want to standardize on just one vendor," he says. "You might want to do some of it in the cloud and you might want to do some in your internal data centers. You want to keep your options open. Once you get locked in, you truly are locked in."