Apple quietly joins ranks of Hadoop users

Like Yahoo, Facebook, and Twitter, Apple embraces open source, data-crunching framework for mobile ads business

Whether or not Hadoop fits Steve Jobs's definition of "open source," Apple appears to be embracing the increasingly popular framework, built for supporting data-intensive distributed applications.

A job posting from Apple indicates that the company is seeking a senior engineer schooled in Hadoop to "design and build [a] scalable Hadoop based ETL infrastructure" for its mobile-advertising program, iAds, as first reported on The Register.

Hadoop is already drawing attention from enterprises across various industries for its ability to house and analyze vast amounts of data over distributed computing environments, aka the cloud. The framework uses MapReduce to spread data processing across a massive number of servers, then combines the results. Observers have declared Hadoop a killer application for the cloud.
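To make the MapReduce idea concrete, here is a minimal single-machine sketch in Python of the classic word-count example. It is not Hadoop code and uses no Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and each "chunk" stands in for the slice of data one server would handle. On a real cluster the map and reduce steps would run in parallel on many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map step: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine every value seen for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce (sequentially here)."""
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Two chunks stand in for data stored on two different servers.
    chunks = ["big data big clusters", "data in the cloud"]
    print(word_count(chunks))
```

The split matters because each map call looks only at its own chunk, so the work can be handed to whichever server holds that data, and only the small per-key results need to be brought together at the end.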

Among the high-profile users of Hadoop is Yahoo, which has enhanced the platform to prep it for business intelligence applications in the enterprise. The company has an M45 cluster built on Hadoop, which spans 4,000 processors and 1.5 petabytes of disk space.

Yahoo's application of Hadoop has already made a mark in the world of research and academia: The company has opened up the facility for big data research to eight universities across the United States. Recently, a team of Yahoo researchers used the company cloud to set a new record in the field of mathematics by computing digits of pi farther out than anyone had before.

Other users include eBay, which is building an 8,500-processor 16-petabyte Hadoop cluster; the New York Times, which uses the platform for tasks such as batch processing of huge volumes of image data, as well as text analytics and data mining; and social media sites such as Facebook and Twitter.

Hadoop's adoption in the enterprise can be attributed in part to the fact that IT companies such as Yahoo have been steadily grooming the platform for corporations. IBM last May announced a new portfolio of BI solutions and services built on Hadoop called IBM InfoSphere BigInsights. Additionally, Informatica announced last month that it had modified its business intelligence software to work with Cloudera's Hadoop distribution. Also this year, Cloudera and Quest announced Ora-Oop, free software for connecting Hadoop to Oracle databases.

This article, "Apple quietly joins ranks of Hadoop users," was originally published at InfoWorld.com. Get the first word on what the important tech news really means with the InfoWorld Tech Watch blog.