Microsoft is developing a connector that will allow Excel users to download and analyze output from Hadoop, potentially opening the open-source data processing platform to a much wider audience.
Microsoft is working on the connector with Hortonworks, a Yahoo spinoff that offers a Hadoop distribution and commercial support services.
"What makes this announcement significant is that Microsoft is opening up Apache Hadoop to literally millions of new users," said Hortonworks CTO Eric Baldeschwieler. "There are many more millions of Excel and PowerPivot users that can now derive value from Apache Hadoop using software that is already very familiar to them."
The connector was among several Hadoop-related open-source projects that Microsoft and Hortonworks announced Tuesday at the O'Reilly Strata Data Conference, being held this week in Santa Clara, California. The two companies formed a partnership last year to adapt Hadoop to the Windows ecosystem.
The connector will be an ODBC (Open Database Connectivity) driver that interacts with Hadoop through the Hive data warehouse system. Users will be able to analyze data downloaded from Hive in Excel, using tools such as Excel PowerPivot.
"Microsoft's ODBC driver for Hive is something they've built so that Excel, Power Pivot and other Microsoft tools can connect to Hadoop (via Hive) more easily," said Shaun Connolly, Hortonworks vice president of corporate strategy, via email. "Hortonworks is focused on working with Microsoft to bring this technology into open source so it can be more widely available to and used by the Apache Hadoop community."
Microsoft has already taken some steps to embed Hadoop in its ecosystem. The company has released a Hadoop connector for SQL Server. It also offers connectivity to an instance of Hadoop on its Azure cloud service, currently as a developer preview.