As enterprise adoption of Hadoop booms, the pool of IT personnel able to build and maintain deployments hasn't kept pace. In May, analyst firm IDC pegged the compound annual growth rate for the Hadoop software market at more than 60 percent, forecasting an ascent to $812.8 million in 2016 from a base of $77 million in 2011.
The Apache Software Foundation's Hadoop distributed computing technology initially staked out a role in search engines. Yahoo helped get the software off the ground, announcing what it termed the largest Hadoop production application in 2008. Distributions of the open-source software have since boosted availability beyond the earliest adopters. Cloudera kicked off its Hadoop distribution in 2009, followed by Hortonworks and MapR Technologies.
As Hadoop penetrates a broadening range of industries, from publishing to agriculture, IT departments are turning to Hadoop distributors and specialized consulting firms to fill skills gaps. CIOs and IT managers seek outside help to launch projects, write code, and generally navigate the Hadoop ecosystem. IT organizations also tap channel partners for training as they work to grow in-house Hadoop talent.
Staff supplementation and training are about the only options for organizations struggling to hire Hadoop experts.
"Short supply is an understatement," says Geoffrey Weber, CIO at Shutterfly, describing the scarcity of Hadoop expertise. "I think, realistically, it is virtually impossible for a company our size to expect to go into the market and find a pocket of...Hadoop veterans."
Shutterfly, which offers an Internet-based personal publishing service, is hardly a small business -- its 2011 revenue exceeded $473 million -- but the company competes with social media giants like Facebook and LinkedIn for a limited supply of Hadoop expertise.
"If you were, yourself, a Hadoop expert, coming out of Yahoo and part of the original team, your skills and experience are almost unique. You can name wherever you want to work and how much you get paid," Weber says. "It's difficult for us to go out and acquire those kinds of skills."
For large-scale deployments, Hadoop skills in short supply
Hadoop targets data sets too large and cumbersome to manage and analyze using conventional database technology. It does this by dispersing big data processing tasks across multiple computing nodes.
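To make the idea of dispersing work across nodes concrete, here is a minimal sketch in plain Python. It does not use Hadoop or any of its APIs. The function names (`split_into_chunks`, `map_chunk`, `reduce_counts`, `distributed_word_count`) and the sample records are hypothetical, and threads merely stand in for cluster nodes. It shows only the general pattern: split the data, let each worker process its own share, then merge the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, num_nodes):
    """Deal records round-robin into one chunk per simulated node."""
    return [records[i::num_nodes] for i in range(num_nodes)]


def map_chunk(chunk):
    """Map step: a node counts the words in its own chunk only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Reduce step: merge the per-node partial counts into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def distributed_word_count(records, num_nodes=3):
    """Split the records, count each chunk in parallel, then merge."""
    chunks = split_into_chunks(records, num_nodes)
    # Threads are an illustrative stand-in for separate machines.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce_counts(partials)


# Hypothetical sample data, three short "records".
sample_records = [
    "big data needs many nodes",
    "many nodes share the data",
    "data moves to the nodes",
]
word_counts = distributed_word_count(sample_records)
print(word_counts.most_common(3))
```

In a real cluster the chunks would live on different machines and the framework would handle scheduling, data movement, and failures, which is where much of the scarce expertise described above comes in.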