But judging from the turnout at a session at the Strata+Hadoop World 2015 conference in New York yesterday, the most urgent questions may be the simplest: What's the best way to get started? How do you demonstrate to the rest of the company that Hadoop is worth the effort?
The session, entitled "Real data, real implementations: What actual customers are doing," was chaired by Andrew Brust of Datameer and featured panelists from American Airlines, Kelley Blue Book, and American Express describing their companies' real-world achievements with Hadoop and what it took to make them happen. Clearly the subject had draw: The audience packed the room, with some attendees lining up along the back wall or sitting on the floor.
Brust opened the panel with a question likely echoed by most enterprise users: How do we get started quickly with Hadoop?
American Express Publishing Corp.'s Kendell Timmers stressed not technology, but people -- specifically, an "information buddy system." Early adopters who wanted to work with Hadoop did all the original heavy lifting, figuring out how to get data into the system and what to download and work with. By the time a second wave of adopters had arrived, the first wave had already developed ways to support each other, such as creating a wiki or roster of "wizards," people who would take an hour out to field one-on-one questions.
Which makes more sense, Datameer's Brust asked: To hire outside Hadoop talent or train one's own people? Jeff Jarrell, a data architect at American Airlines, noted that while his company does a lot of internal grooming, "a lot of people [from outside] do want to get into this space." Many of the company's outside hires are from universities with data science programs. "[From there] we get 'adepts' -- first-year hires -- who are motivated to use the tech."
Timmers said American Express's approach was to do both -- hire people from outside who can get started quickly and bring in new ideas, but also cultivate internal talent to leverage what they know about the business. "You already have a lot of valuable people who know about your data, and that's extremely valuable and not replaceable," he said.
This emphasis on the human element makes sense -- a shortage of Hadoop skills is a big reason why many Hadoop deployments don't provide the expected return on investment.
What about demonstrating business value to the rest of the company? At American Express, Timmers said the proof came with a program that matched third-party offers to card members, using algorithms to determine the best matches. The original algorithm "took two and a half days to run" and produced poor matches; the new Hadoop-based match algorithm runs in "only four hours," produces far better results, and has been widely adopted.
Ryan Wright, a manager of data management at Kelley Blue Book, said his company developed an entirely new reporting environment for the marketing side of the business, which allowed the marketing team to budget better. This example underscores that enabling self-service reporting with Hadoop is one of the most tangible ways to demonstrate its value.