Despite the growing interest in big data platforms, it may be some time before organizations will be able to deploy a standardized big data software stack, concluded a panel of speakers Wednesday during a virtual panel hosted by GigaOm.
The panelists agreed that a standardized stack of big data analysis software would make it easier to develop large scale data analysis systems, in much the same way the open source LAMP stack engendered a whole generation of Web 2.0 services over the past decade. But the ways software such as Hadoop can be used vary so much that it may be difficult to settle on one core package of technologies, the panelists said.
[ Explore the current trends and solutions in BI with InfoWorld's interactive Business Intelligence iGuide. | Discover what's new in business applications with InfoWorld's Technology: Applications newsletter. ]
LAMP is an abbreviation for a set of software programs that work very well together: Linux, the Apache Web server, the MySQL database, and a set of programming languages -- Perl, Python, and PHP.
LAMP "provided a common framework upon which people could build. It was freely available. It was easily understood. It ran on almost anything. It created a foundation upon which a generation of start ups grew up," said independent consultant Paul Miller, who moderated the panel, "Designing for Big Data: The New Architectural Stack."
"As we're beginning to see an explosion of interest in big data, do we need a stack that is similarly ubiquitous? Do we need a LAMP stack for big data?" Miller asked.
All agreed that not having a standardized stack slows deployments of big data systems. "There isn't a standard stack, and people aren't clear which piece works best for a particular workload. There's a trial and error period going on now," said Jo Maitland, a research director covering cloud technology for GigaOm Pro.
One reason LAMP was so popular was that its users all had similar needs, all based around putting services online, pointed out Mark Baker, Canonical Ubuntu server product manager. The needs around analysis, on the other hand, tend to vary from business to business, and change often, he noted.
Large Web services companies that use Hadoop, such as eBay and Twitter, are running in a "continuous beta," and they hire a lot of technically competent staff to handle the pace of rapid change," said Mark Staimer, president of Dragon Slayer Consulting.
"Having a constantly evolving platform and stack is fine for them. They have the process and culture within the company to manage it," Staimer said. The more traditional "brick and mortar" companies are "much more conservative," Staimer added. "They like to see a fully baked solution."
Arriving at such a stack may be difficult, given the variety of technologies available, and the degrees of difficulty inherent in connecting them together in various configurations.
"Now we have loads of different pieces out there that you can plug together. Just in the database space, there is MongoDB, Cassandra, HSpace," Maitland said. All this choice "makes it more difficult for people. We're in a mashup situation with all these different components."