From the number of times you've heard the word "Hadoop," you'd think it referred to some magic elixir for making sense of big data. In reality, Hadoop is an open source framework for distributed data storage and processing -- with enormous analytics potential for those who know how to use it.
To demystify Hadoop, and to get a personal perspective from one of the leading lights in the space, IDG Enterprise Chief Content Officer John Gallant and InfoWorld Editor in Chief Eric Knorr turned to Mike Olson, CEO of Cloudera. The hour-long interview, an edited version of which appears below, is part of the ongoing IDG Enterprise CEO Interview Series.
Olson began his career in the '80s and '90s building, and later selling and managing, companies that developed relational database products. In 2000 he became CEO of Sleepycat Software, maker of the open source embedded database engine Berkeley DB. In 2006 he negotiated the sale of Sleepycat to Oracle. Olson stayed with Oracle as vice president of embedded technologies for two years; shortly after departing, he "stumbled across" Hadoop. "When I saw how it was being used in the consumer Internet...I got excited and thought there would be an opportunity in traditional enterprises," he says.
As it turned out, three other entrepreneurs -- Christophe Bisciglia (Google), Amr Awadallah (Yahoo), and Jeff Hammerbacher (Facebook) -- all felt inspired to start a Hadoop venture at roughly the same time. "We banded together in the summer of 2008 to create just one, rather than four, such companies," says Olson. A year later, Doug Cutting, co-creator of the Hadoop project itself, joined Cloudera as chief architect.