Recognizing Apache Hadoop's growing presence in the field of data analysis, Dell will start selling servers preloaded with the open source data processing platform, the company announced Thursday.
The package "is a direct response to feedback we've been hearing from our customers," said Joseph George, director of cloud marketing for Dell. A significant portion of Dell's enterprise customers are either considering Hadoop or already running it.
"Hadoop is becoming a de facto standard," George said. "We've built a reference architecture on top of hardware that is attuned to this environment."
Created by search specialist Doug Cutting, Apache Hadoop has been increasingly used by organizations to sift through large sets of unstructured data, such as server logs.
Dell's Cloudera Solution for Hadoop uses a bundle of Hadoop software offered by Cloudera, including the Cloudera Distribution of Hadoop (CDH) and the Cloudera Enterprise suite of management tools. For managing the deployment of software, the package also includes a copy of Dell's own Crowbar software.
CDH is a collection of commonly used Hadoop components, including Hadoop itself, Hive, Pig, HBase, ZooKeeper, Whirr, Flume, Hue, Oozie, and Sqoop. The servers can be outfitted with Red Hat Enterprise Linux (either version 5.6 or 6), CentOS, Ubuntu, or SUSE operating systems. Users can order the servers with the software fully installed, or they can use Crowbar to install the software themselves.
On the hardware side, the package can come with Dell PowerEdge C2100, C6100, or C6105 servers. The PowerEdge C-series servers are uniquely suited for Hadoop's multiserver deployments because of their modest physical size and power usage, George said. The package also includes a set of PowerConnect 6248 48-port Gigabit Ethernet Layer 3 switches. A deployment based on the reference architecture could scale from 6 nodes to 720 nodes.
Dell will also offer training and technology support.
The package will offer organizations "lower risk" and "a faster time to production" when compared to buying the servers separately and installing the software on them by hand, George said. Dell expects that the initial customers for this package will be financial services firms, utilities, telecommunications companies, research institutions, retail businesses and Internet media outlets.
The cost of a minimum configuration would run from $118,000 to $124,000, depending on the support options. The package includes a one-year subscription to Cloudera's support and updates. The minimum configuration would consist of six PowerEdge C2100 servers -- two management nodes, one edge node, and three slave nodes -- as well as six Dell PowerConnect 6248 switches to bind these servers together.
This package is similar to another offering that Dell unveiled last week for setting up OpenStack cloud deployments. Both packages were developed by Dell's Next Generation Compute Solutions Group. Dell CEO Michael Dell has expressed his intent to move Dell more into sales of integrated packages of hardware, software and support, or "solutions."
With both of the releases from the Next Generation Compute Solutions Group, Dell "extends our focus around open source solutions for customers," said John Igoe, executive director of cloud solutions.