Allows for new types of applications
One huge benefit of Hadoop is its ability to analyze huge data sets to quickly spot trends, Lazzaro says. For a major retailer, that could mean scouring Facebook or Twitter user data to learn what scarf colors were in fashion last season, then comparing that information with today's hot color trends to help determine what will sell this season.
"It gives you the ability to look back in time to look for opportunities for new sales," Lazzaro says. This plays out at Concurrent when the firm analyzes a commercial or ad for a car dealership. "We can look at the data to see who's watched the commercials; then you might have a targeted sales lead you can leverage to make a sale. You don't always know what you are looking for."
Traditional databases can work for many sorting and analysis needs, but with ultra-large data sets, Hadoop can be a much more efficient way to find things, Lazzaro says. "It's really built for handling that."
For their part, eBay's engineers "like being able to work with unstructured data ... and build new products for eBay quickly," Williams says. Engineers can access the firm's 300 million listings, historical data and vast amounts of related information, Williams says, and "this allows us to understand customers and build experiences they want." It's not really about the structured versus unstructured issue; rather, "it's about our engineers being able to roll up their sleeves and work with our data like never before," he says.
In the last year, eBay has done "some really amazing things with Hadoop, including improvements in merchandising, buyer experience and how customers use the site," Williams says.
Over the course of a year, for instance, eBay staffers can see when customers start typing in Halloween and Christmas queries. "With that I can tell you the kinds of things people are looking for," Williams says. "We didn't comprehend this use of the data five years ago -- not at all."
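To make the seasonal-query idea concrete, here is a minimal sketch in Python of the map-and-reduce style of counting that Hadoop is known for. It is not eBay's code: the sample search log, the keyword list and every function name are made up for illustration, and a real job would run across a cluster rather than in a single process.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample of search-log records: (day of search, query text).
# Real logs would be far larger and spread across a Hadoop cluster.
SEARCH_LOG = [
    (date(2012, 8, 20), "halloween costume"),
    (date(2012, 9, 3), "Halloween decorations"),
    (date(2012, 9, 15), "vintage lamp"),
    (date(2012, 10, 1), "christmas lights"),
    (date(2012, 10, 12), "halloween mask"),
    (date(2012, 11, 5), "Christmas tree"),
]

# Assumed keywords of interest; an illustrative list, not a real one.
SEASONAL_TERMS = ("halloween", "christmas")


def map_record(day, query):
    """Map step: emit ((term, 'YYYY-MM'), 1) for each seasonal term in a query."""
    text = query.lower()
    for term in SEASONAL_TERMS:
        if term in text:
            yield (term, day.strftime("%Y-%m")), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each (term, month) key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def run(log):
    """Apply the map step to every record, then reduce the emitted pairs."""
    pairs = (pair for day, query in log for pair in map_record(day, query))
    return reduce_counts(pairs)


def first_month_seen(totals):
    """Return the earliest month in which each term shows up at all."""
    first = {}
    for term, month in sorted(totals):
        first.setdefault(term, month)
    return first


if __name__ == "__main__":
    totals = run(SEARCH_LOG)
    for (term, month), count in sorted(totals.items()):
        print(term, month, count)
    print(first_month_seen(totals))
```

The map step looks at one record at a time and the reduce step only adds up counts per key, which is what lets this kind of job be split across many machines.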
Be careful out there
As good as Hadoop is, there are some cautions. First, "don't commit to or standardize on one vendor quite yet," because it's such a "turbulent" space right now, Forrester's Kobielus suggests. "The vendors are all continuing to rapidly evolve." On the other hand, that does create a "vibrant ecosystem," he says.
Marcus Collins, an analyst at Gartner, says it's up to the enterprise to acquire the expertise needed to get the most out of Hadoop. "It's asking for a level of analytics capabilities that many companies don't have today," he says. "You need to train your staff and invest in analytics, and that will put you in the best position to exploit this technology."
Another key consideration: Most shops will need to hire Hadoop specialists, who are in short supply, or will need to train in-house staffers. "It's not trivial to use," eBay's Williams says. "So we've put a lot of training in place so our engineers know how to use Hadoop and can write code. You're going to have to invest in your developers and program manager so they can become proficient users. Don't underestimate that."
Also be prepared for an organizational learning curve when relying on an open-source system for a mission-critical application. Using it for a few under-the-radar projects is one thing; developing a massive system for all the world to see is another entirely. Be prepared to educate your management about the benefits of open source.
Another tip from Collins is to stay "intimately involved" with the project to make sure it goes as planned. "Don't just give your problems to your Hadoop vendor," he says. At the end of the day, "you're going to be running it."