Just last week, SAP unveiled a new big data bundle designed to let large organizations integrate Hadoop environments with SAP's HANA in-memory database and associated technologies. The bundled product uses the SAP HANA platform to read and load data from Hadoop environments and then run fast interactive analytics on that data using SAP's reporting and analytics tools.
SAS announced a similar capability for its High Performance Analytic Server a few weeks ago. HP, with technology gained in its acquisition of Vertica; Teradata, with its Aster-Hadoop Adaptor; and IBM, with its Netezza tool sets, offer or will soon offer similar capabilities.
The business has also attracted a handful of startups. One, Metamarkets, has developed a cloud-based service designed to help companies analyze copious amounts of fresh streaming data in real time. At the heart of the service is an internally developed distributed, in-memory, columnar database technology called Druid, according to the company's CEO, Michael Driscoll. He compares Druid to Dremel in concept.
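As a rough illustration of what a columnar layout means in practice, the Python sketch below pivots a few records into one list per field and then answers an aggregate query by scanning only the two columns it needs. It is not Druid's or Dremel's code: the sample records, the field names, and the helpers `to_columns` and `sum_revenue_by_country` are all hypothetical, and the sketch assumes every record carries the same fields.

```python
from typing import Dict, List

# Hypothetical event records, stored row by row as a traditional store might.
rows: List[Dict[str, object]] = [
    {"user": "a", "country": "US", "revenue": 10.0},
    {"user": "b", "country": "DE", "revenue": 4.5},
    {"user": "c", "country": "US", "revenue": 7.5},
]


def to_columns(records: List[Dict[str, object]]) -> Dict[str, list]:
    """Pivot row records into one list per field (a columnar layout).

    Assumes every record has the same set of fields.
    """
    columns: Dict[str, list] = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


def sum_revenue_by_country(columns: Dict[str, list]) -> Dict[str, float]:
    """Aggregate revenue per country, touching only the two columns needed."""
    totals: Dict[str, float] = {}
    for country, revenue in zip(columns["country"], columns["revenue"]):
        totals[country] = totals.get(country, 0.0) + revenue
    return totals


columns = to_columns(rows)
totals = sum_revenue_by_country(columns)
```

The query never reads the `user` column. Skipping unneeded fields is one reason column-oriented stores tend to suit analytical workloads, although real systems add compression, distribution and parallel scans that this sketch leaves out.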
"Dremel was architected from the ground up to be an analytical data store," Driscoll said. Its column-oriented, parallelized, in-memory design makes it several orders of magnitude faster than a traditional data store, he said. "We have a very similar architecture," Driscoll said. "We are column-oriented, distributed and in-memory."
The Metamarkets technology, though, allows enterprises to run queries over data even before it is streamed into a data store, so it allows for even faster insight than Dremel, he said.
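The idea of querying data before it reaches a data store can be sketched in a few lines of Python. The example below is an assumption-laden illustration, not the Metamarkets implementation: the class `StreamingAggregator` and its methods `ingest`, `query` and `flush` are hypothetical names, and a plain list stands in for the downstream store.

```python
from collections import defaultdict
from typing import List, Tuple


class StreamingAggregator:
    """Keeps running totals in memory so queries can be answered
    before events are written to a downstream store (hypothetical sketch)."""

    def __init__(self) -> None:
        self.totals = defaultdict(float)            # running sum per key
        self.pending: List[Tuple[str, float]] = []  # events not yet stored

    def ingest(self, key: str, value: float) -> None:
        """Update the in-memory aggregate and buffer the raw event."""
        self.totals[key] += value
        self.pending.append((key, value))

    def query(self, key: str) -> float:
        """Return the current total for a key, or 0.0 if it is unseen."""
        return self.totals.get(key, 0.0)

    def flush(self, store: list) -> int:
        """Append buffered events to the store; return how many were written."""
        count = len(self.pending)
        store.extend(self.pending)
        self.pending.clear()
        return count


aggregator = StreamingAggregator()
aggregator.ingest("clicks", 1)
aggregator.ingest("clicks", 2)
early_answer = aggregator.query("clicks")  # available before any flush

store: list = []                           # stand-in for a data store
written = aggregator.flush(store)
```

Here the total for `clicks` can be read as soon as the events arrive, while the raw events are written to the store later for batch processing.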
Metamarkets earlier this year released Druid to the open source community to spur more development activity around the technology. The demand for such technology is driven by the need for speed, Driscoll said. Hadoop, he said, is simply too slow for companies that need sub-second query response times. Analytics technologies such as those being offered by the traditional enterprise vendors are faster than Hadoop but still don't scale as well as a Dremel or a Druid, Driscoll said.
Nodeable, another venture-backed startup, offers a cloud-hosted service called StreamReduce that is similar to the Metamarkets offering. StreamReduce is powered by Storm, an open source data analytics technology originally developed by BackType before it was acquired by Twitter last year. Storm, also used internally by Twitter, is designed to let enterprises run real-time analytics on streaming data.
Nodeable offers a connector to Hadoop so enterprises can use the service to run interactive queries against data stored in their Hadoop environment as well, CEO Dave Rosenberg said. Nodeable was launched as a cloud system management company but switched tracks after seeing an opportunity for big data analytics technology.
"We realized there was a lack of a real-time complement to Hadoop. We asked ourselves, how do we get real-time with Hadoop?" Rosenberg said. Services such as Nodeable's do not replace Hadoop, they complement it, Rosenberg said.
StreamReduce gives companies a way to extract actionable information from streaming data that can be stored in a Hadoop environment or in another data store for more traditional batch processing later, he said.