Red Hat, OpenStack startup Mirantis, and Hadoop vendor Hortonworks announced today at the OpenStack Summit that they are throwing resources behind Project Savanna, which is aimed at delivering the Hadoop big data engine atop the OpenStack cloud platform. The goal of the open source endeavor is to enable OpenStack users to easily provision and manage elastic Hadoop clusters on the cloud platform for such purposes as delivering "analytics as a service" for ad-hoc or bursty analytic workloads, similar to Amazon Web Services' Elastic MapReduce.
"The cloud is a logical deployment platform for Apache Hadoop," said Bob Page, Hortonworks' vice president of products. "Coupled with the fact that Hadoop is a net new workload for many organizations, deployment on OpenStack is a logical fit."
Project Savanna is designed to function as an OpenStack component that can be managed through a REST API or the OpenStack Dashboard. According to the project documentation, it supports different Hadoop distributions, functioning as a pluggable system of Hadoop installation engines as well as integrating with vendor-specific management tools such as Apache Ambari or Cloudera Management Console. Further, Savanna will provide pluggable integration with external monitoring systems such as Nagios or Zabbix.
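To make the "pluggable system of Hadoop installation engines" idea concrete, here is a minimal Python sketch of how such a plugin registry might look. It is an illustration only, not Savanna's actual code: the names ProvisioningPlugin, EchoPlugin, register, and provision_cluster are all hypothetical, and the sample plugin merely returns hostnames rather than installing anything.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ProvisioningPlugin(ABC):
    """Hypothetical interface each Hadoop installation engine would implement."""

    @abstractmethod
    def provision(self, cluster_name: str, node_count: int) -> List[str]:
        """Return the hostnames of the nodes making up the cluster."""


class EchoPlugin(ProvisioningPlugin):
    """Stand-in engine: it only names the nodes, it installs nothing."""

    def __init__(self, distribution: str) -> None:
        self.distribution = distribution

    def provision(self, cluster_name: str, node_count: int) -> List[str]:
        return [
            f"{cluster_name}-{self.distribution}-{i}"
            for i in range(node_count)
        ]


# Registry mapping a distribution name to the engine that handles it.
_PLUGINS: Dict[str, ProvisioningPlugin] = {}


def register(distribution: str, plugin: ProvisioningPlugin) -> None:
    _PLUGINS[distribution] = plugin


def provision_cluster(distribution: str, cluster_name: str,
                      node_count: int) -> List[str]:
    """Dispatch to whichever engine is registered for the distribution."""
    if distribution not in _PLUGINS:
        raise ValueError(f"no plugin registered for {distribution!r}")
    return _PLUGINS[distribution].provision(cluster_name, node_count)


register("vanilla", EchoPlugin("vanilla"))
nodes = provision_cluster("vanilla", "analytics", 3)
```

The point of the design is that the core service never needs to know how a given distribution is installed; adding support for another vendor means registering one more plugin.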
The project will also include predefined templates of Hadoop configurations with modifiable parameters.
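As a rough illustration of what a predefined template with modifiable parameters could look like, consider the Python sketch below. The ClusterTemplate class and its customize method are assumptions made for this example and do not reflect Savanna's real schema; dfs.replication and mapred.reduce.tasks are ordinary Hadoop configuration properties used here as sample values.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class ClusterTemplate:
    """Hypothetical predefined Hadoop configuration."""

    name: str
    worker_count: int
    params: Dict[str, str] = field(default_factory=dict)

    def customize(
        self,
        worker_count: Optional[int] = None,
        overrides: Optional[Dict[str, str]] = None,
    ) -> "ClusterTemplate":
        """Return a copy with selected values changed; the original is untouched."""
        overrides = overrides or {}
        # Only parameters the template already defines may be modified.
        unknown = set(overrides) - set(self.params)
        if unknown:
            raise KeyError(f"unknown parameters: {sorted(unknown)}")
        return ClusterTemplate(
            name=self.name,
            worker_count=(
                self.worker_count if worker_count is None else worker_count
            ),
            params={**self.params, **overrides},
        )


SMALL = ClusterTemplate(
    name="small",
    worker_count=3,
    params={"dfs.replication": "3", "mapred.reduce.tasks": "4"},
)

# Start from the predefined template and adjust only what differs.
burst = SMALL.customize(worker_count=10, overrides={"dfs.replication": "2"})
```

Returning a new object instead of mutating the template keeps the predefined configuration reusable across many clusters.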
One of the initial project goals is to create an integration point for third-party Hadoop provisioning and management frameworks such as Apache Ambari, a Hadoop monitoring and lifecycle management program. By establishing an integration point, backers of Project Savanna aim to provide enterprise users with a way to quickly provision Hadoop distributions via OpenStack APIs and the OpenStack Dashboard.
In addition, Savanna will let Hadoop workloads draw on unused compute power from a general-purpose OpenStack infrastructure pool, according to the trio's announcement.
By the end of September, Project Savanna developers aim to deliver the analytics-as-a-service capability, which will entail developing an API capable of executing MapReduce jobs without exposing details of underlying infrastructure (similar to AWS EMR). They also expect to have completed a UI for ad-hoc analytics queries based on Hive or Pig.
More information about the project is available from the Savanna website.
This story, "Red Hat, Mirantis, and Hortonworks unite behind Hadoop on OpenStack," was originally published at InfoWorld.com.