The market for software related to the Hadoop and MapReduce programming frameworks for large-scale data analysis will jump from $77 million in 2011 to $812.8 million in 2016, a compound annual growth rate of 60.2 percent, according to a new report released Monday by analyst firm IDC.
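The growth rate follows directly from the two revenue figures. As a rough check, and assuming the standard compound annual growth rate formula over the five years from 2011 to 2016 (the function name below is illustrative and does not come from the IDC report), the arithmetic can be sketched as:

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate, returned as a fraction (0.5 means 50 percent)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted from the report, in millions of dollars.
# Assumption: the growth period is the five years between 2011 and 2016.
rate = cagr(77.0, 812.8, 2016 - 2011)
print(f"{rate:.1%}")  # prints 60.2%
```

The result matches the 60.2 percent cited by IDC.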
Hadoop is an open source implementation of the MapReduce framework. It is hosted at the Apache Software Foundation along with a number of supporting software projects, including the Hadoop Distributed File System (HDFS) and Pig programming language.
MapReduce and Hadoop are based on the principle of splitting up large amounts of data and then processing the chunks in parallel across large numbers of nodes. The approach is closely associated with the industry buzzword "big data," which refers to the ever-larger volumes of information, particularly unstructured information, being generated by websites, social media, sensors, and other sources.
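The split-then-process-in-parallel idea can be illustrated with a toy word count. This is a minimal single-machine sketch, not Hadoop's API: threads stand in for cluster nodes, and every function name here is hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(records, num_chunks):
    """Split the input into roughly equal chunks, one per hypothetical node."""
    size = max(1, -(-len(records) // num_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk):
    """Map step: count the words in one chunk, independently of the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


def word_count(records, num_chunks=4):
    chunks = split_into_chunks(records, num_chunks)
    # Assumption: worker threads play the role of cluster nodes in this sketch.
    with ThreadPoolExecutor(max_workers=num_chunks) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(merge_counts, partials, Counter())


if __name__ == "__main__":
    lines = ["big data", "Big Data tools", "data"]
    print(word_count(lines, num_chunks=2))
```

Because each chunk is counted without reference to the others, the map step can be spread across as many workers as are available, and only the small partial results need to be combined at the end.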
Overall, Hadoop has attracted steady interest in recent years from commercial analytics and database vendors, which have begun offering products and services for it.
While "fantastic and largely unsupportable claims have been made" regarding Hadoop and MapReduce's use cases and benefits, "there can be no doubt that it does provide a relatively low-cost means of deriving considerable value from very large collections of unorganized data," IDC analysts Carl Olofson and Dan Vesset wrote in the report.
Therefore, the conditions are right for significant growth in the Hadoop-MapReduce "ecosystem," according to IDC.
This year, "leading adopters in the mainstream IT world will move from 'proof of concept' to real value," the report states.
However, lack of qualified talent will limit the technology's rise during the next two to three years, it adds.
The coming years will also see a battle involving "open source purists, who believe that the core of Hadoop deployment must be based purely on the Apache project code," according to IDC. However, most IT organizations will use a mix of commercial and open source components in their Hadoop environments, the report adds.
Still, "competition between open source vendors and their closed source counterparts may force lower license fees from the latter group, resulting in somewhat slower software revenue growth than would be the case if open source projects did not represent so large a component of this market space."
IDC is a subsidiary of IDG News Service's parent company, International Data Group.
Chris Kanaracus covers enterprise software and general technology breaking news for The IDG News Service. Chris's e-mail address is Chris_Kanaracus@idg.com