Microsoft, Hortonworks to link Excel and Hadoop

Companies will develop an ODBC driver and a JavaScript library for Hadoop, potentially opening the open-source data processing platform to a wider audience

Microsoft is developing a connector that will allow Excel users to download and analyze output from Hadoop, potentially opening the open-source data processing platform to a much wider audience.

Microsoft is working on the connector with Hortonworks, a Yahoo spinoff that offers a Hadoop distribution and commercial support services.

"What makes this announcement significant is that Microsoft is opening up Apache Hadoop to literally millions of new users," said Hortonworks CTO Eric Baldeschwieler. "There are many more millions of Excel and PowerPivot users that can now derive value from Apache Hadoop using software that is already very familiar to them."

The connector was among several Hadoop-related open-source projects that Microsoft and Hortonworks announced Tuesday at the O'Reilly Strata Data Conference, being held this week in Santa Clara, California. The two companies formed a partnership last year to adapt Hadoop to the Windows ecosystem.

Microsoft and Hortonworks are also developing a JavaScript framework, one that will allow JavaScript programs to explore Hadoop data. And they are working on a series of patches for the Hadoop core that will allow the software to be run on Windows Server.

The connector will be an ODBC (Open Database Connectivity) driver that interacts with Hadoop through the Hive data warehouse system. Users will be able to analyze data downloaded from Hive in Excel, using tools such as Excel PowerPivot.

"Microsoft's ODBC driver for Hive is something they've built so that Excel, Power Pivot and other Microsoft tools can connect to Hadoop (via Hive) more easily," said Shaun Connolly, Hortonworks vice president of corporate strategy, via email. "Hortonworks is focused on working with Microsoft to bring this technology into open source so it can be more widely available to and used by the Apache Hadoop community."

Development for the JavaScript framework will follow a similar path, Connolly said. Microsoft built the core platform for its own products, and Hortonworks will modify it for broader use.

Neither the connector nor the JavaScript framework is available now, but both will be released in the near future as open source, Connolly said.

Microsoft has already made some efforts to embed Hadoop in its ecosystem. The company has released a Hadoop connector for SQL Server. It also offers connectivity to an instance of Hadoop on its Azure cloud service, currently as a developer preview.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's email address is Joab_Jackson@idg.com.
