Hadoop is probably as mature as it's going to get

Five years ago, Hadoop came roaring into the mainstream as the solution to all big data problems. Now that reality has settled in, it's time for a more realistic assessment.


We are now squarely in the middle of the second decade of the 21st century. When big data mania got rolling around five years ago, the consensus that the future went by the name Hadoop was shockingly pervasive. Growth in the Hadoop market since that time showed that this was no fad. The unrelenting hype has at least had some grounding in Hadoop’s marketplace adoption and innovation.

Given that everybody pretty much agrees Hadoop is important, must we in the big data industry continue beating the drum for its proverbial “next big thing” status? Has Hadoop’s inflection point long since passed -- and is its maturation point fast approaching? When a segment shows all the signs of maturation, it’s time to tone down the marketing overkill. The Hadoop “next big thing” may now be as “big” as it’ll ever get, in terms of its share of the big data analytics market (though the overall market may itself continue growing like the proverbial gangbusters).

To determine whether Hadoop has reached this point, let’s review how far this segment has come and how it'll likely evolve going forward.

Startup activity is a clear sign of a growth market, and its decline is a strong signal of maturation. After a tremendous burst of startup formation in the early years of this decade, it would now appear that Hadoop platform, tool, and application vendors have settled into a familiar group of usual suspects. For example, every single vendor mentioned in this recent InformationWeek market overview was already in this space three to four years ago when I was Forrester’s Hadoop analyst. That’s one clear sign of a maturing market.

Another sign of Hadoop’s maturation is the fact that the chief demand drivers are essentially constant from year to year, reflecting a niche that is continuing to scratch the same itch. Once again, the cited article rattles off survey response numbers showing that users adopt Hadoop principally for unstructured data analysis, predictive customer analytics, sentiment analysis, and so on. None of that is appreciably different from what I saw in my primary research into the then-embryonic Hadoop market in 2011.

Yet another sign of segment maturation is the fact that the industry tends to hammer on the same themes over and over, year after year, as befits a solution space that has found its functional sweet spot. For example, the big data blogosphere continues to tiresomely debate the already settled issue of whether SQL has a future in the Hadoop ecosystem. The answer is decidedly yes, as evidenced by the range of alternative SQL access/analysis options from every major vendor listed in the cited article.

Related to that “hammering the same old themes” trend is the matter of Hadoop’s still-blurry market scope. As I stated in this Dataversity column from last April, Hadoop still has no clear boundaries (vis-à-vis NoSQL and other big data approaches), which was essentially what I had said three years previously in my Forrester days. Then and now, the Hadoop industry’s “identity crisis” stems in part from the group’s lack of standardization and failure to coalesce around a unifying vision for what Hadoop is and can evolve into.

If you look at the Apache Software Foundation’s definition of Hadoop now, it still feels like a catch-all rather than a definitive architecture. For example, the recent inclusion of Spark in the scope of Hadoop feels as arbitrary as continuing to include Cassandra. Nobody in the industry seriously considers Spark a component of Hadoop; it is widely seen as a competitor. By contrast, Cassandra isn’t even the hottest open source, real-time, big data community out there, and its growth days seem to have waned considerably.

Also, you sense that a segment is starting to saturate its target market when discussions increasingly focus on its still-puny adoption rate among mainstream users. That’s front and center in the cited article’s discussion of its survey findings:

[InformationWeek’s] data suggests that train hasn't left the station just yet: Just 4% of companies use Hadoop extensively, while 18% say they use it on a limited basis…That is up from the 3% reporting extensive use and 12% reporting limited use of Hadoop in our survey last year. Another 20% plan to use Hadoop, though that still leaves 58% with no plans to use it.

If you’ve been in the analytics industry for more than a handful of years, this smacks of déjà vu. More than two decades into its existence as a discrete segment, the business intelligence (BI) market continues to agonize over low adoption rates among mainstream knowledge workers. Perhaps BI -- or Hadoop or any other big data segment -- was not fated to be as ubiquitously adopted as, say, smartphones.

That doesn’t mean Hadoop can’t develop into a hugely important and lucrative segment within its own well-defined niche. After all, nothing’s stopping a mature person from growing rich and popular as their hair fades to gray.