Amid all the recent noise about Hortonworks and Pivotal, it's been easy to overlook another major Hadoop player, Cloudera, and the milestone release earlier this month of its own product, Cloudera Enterprise 5.
Now that the dust has settled a bit, it's time we took a look at how the company that was pushing "Hadoop for everything" last year is setting itself apart from the rest of the pack. It's not just by crafting Hadoop into a consistently deliverable product; it's by making Cloudera as broadly supported and unabashedly commercial as possible, by building a range of strategic partnerships with unexpected cohorts, and by using that position to further fortify Hadoop as an open source project.
The first way Cloudera has staked its own ground, both before and now, is through its business model. Where Hortonworks is bending over backward to ensure its enterprise-grade Hadoop distributions are pure open source products, Cloudera's more akin to an outfit like Pivotal, which has no issues with building a commercial product out of open source bits (though Cloudera makes a point of contributing back to the projects it takes from).
Also like Pivotal, Cloudera wants to make Hadoop into more than MapReduce by having it serve as a generic all-in-one data hub for all of an organization's harvested data, including live streamed data. The enterprise uptake on this idea has been mixed, not only because there are many kinds of data that don't lend themselves to the Hadoop model (transactional data, for instance), but because porting existing business apps to Hadoop, as opposed to simply using Hadoop's pool of analytics tools as-is, hasn't been a trivial exercise.
But easier migration into Hadoop and expanded data support and processing options aren't Cloudera's only selling points, though it has been making noises in that direction. Rather, it's trying to stand out through partnerships and alliances, all of which are designed to garner big infusions of cash and a growing base of paying customers.
One prime example is Intel, which backed off from developing its own bespoke edition of Hadoop and switched to Cloudera software, to the tune of $900 million, for its own analytics work. Intel characterized the move as "the single largest data center technology investment in its history."
But it wasn't the cash alone or nabbing the title of Intel's top Hadoop supplier that made the partnership worthwhile. Rather, Cloudera saw it as a chance to partner with a company with a historical interest in being a fixture in the data center. Cloudera chief strategy officer Mike Olson put it this way: "When Linux began to break, they crafted a relationship with Red Hat. When virtualization emerged as a new technology, they invested a considerable sum in VMware."
Not long before that, Cloudera racked up $160 million in investments from sources as diverse as T. Rowe Price and Google Ventures. Cloudera has also been positioning itself as the go-to Hadoop vendor for those who want to run a commercially supported Hadoop offering in a public cloud, with IBM SoftLayer, Verizon Terremark, and Amazon EC2 all signing up.
Another key way Cloudera is trying to stand apart puts it in contrast with not only Pivotal but IBM. When Matt Asay looked at the various big-name Hadoop vendors, he broke ranks with Forrester Research's assessment of who was most valuable in that space and named Cloudera and Hortonworks as two of the best bets. His rationale: Those companies contribute back the most to Hadoop's development, while other outfits like IBM and Pivotal seem content to simply take without giving much back. "Only [Cloudera and Hortonworks] are in a credible position to influence Hadoop's roadmap, and support it best," he wrote.
That said, Cloudera's next big step doesn't appear to revolve around some major upgrade to Hadoop as a project. Rather, it lies in the looming possibility that the company will someday go public. Not that this is set to happen anytime soon, since Cloudera wants the stars to be in proper alignment first and won't simply do it because it needs the money (which, after its Intel deal, probably won't be an issue for the foreseeable future).
More worrisome is the prospect of a newly public Cloudera becoming a takeover target for an IBM or an Oracle -- two of the many outfits with an interest in Hadoop born more of competitive interest than native enthusiasm for the technology. In the meantime, Cloudera's strategy remains one of staking out a space for itself between business sense and open source savvy.
This story, "How Cloudera plans to stand out from the Hadoop herd," was originally published at InfoWorld.com.