Adatao enhances Hadoop with natural-language queries and machine learning

Using machine learning, Adatao's Data Intelligence Platform aims to make querying Hadoop as easy as typing a question

Last year, data visualization company Adatao -- spearheaded by the former engineering director of Google Apps -- announced its Data Intelligence Platform, a tool set for querying Hadoop data that's meant to be as easy as putting together a document in Google Docs.

Today Adatao announced general availability of the tool set, which the company says puts machine intelligence at the disposal of nontechnical users running natural-language queries against Hadoop data.

Christopher T. Nguyen, CEO of Adatao, describes the goals for the Data Intelligence Platform as "big data for the iPhone generation." More than simply creating visually appealing reports from data, this involves putting machine-learning techniques within reach of less technical users.

In a demo of the product's visual-reporting system, called Adatao Narratives, Nguyen showed how the natural-language querying process worked, using a data set that listed airline delays over the course of a year. By typing "show relationship between arrdelay [arrival delay] and month," he produced a graph depicting that relationship. (Those who don't want to use natural language can work with SQL queries, R, or Python.)
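To make the idea concrete, here is a minimal sketch of how a query such as "show relationship between arrdelay and month" could be turned into a grouped average. It is not Adatao's implementation, which is proprietary; the sample rows, the regular expression, and the function names (parse_query, relationship, answer) are all hypothetical and exist only for illustration.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the airline data set from the demo is not reproduced here.
FLIGHTS = [
    {"month": 1, "arrdelay": 12.0},
    {"month": 1, "arrdelay": 8.0},
    {"month": 2, "arrdelay": 3.0},
    {"month": 2, "arrdelay": 5.0},
    {"month": 3, "arrdelay": 20.0},
]

# Assumed phrasing: only this single sentence template is recognized.
QUERY_PATTERN = re.compile(
    r"show relationship between (\w+) and (\w+)", re.IGNORECASE
)


def parse_query(text):
    """Extract (measure, dimension) column names from the query text."""
    match = QUERY_PATTERN.search(text)
    if match is None:
        raise ValueError("unsupported query: " + text)
    return match.group(1).lower(), match.group(2).lower()


def relationship(rows, measure, dimension):
    """Average the measure column for each distinct value of the dimension."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[measure])
    return {key: mean(values) for key, values in sorted(groups.items())}


def answer(text, rows=FLIGHTS):
    """Parse the query, then compute the per-group averages a chart would plot."""
    measure, dimension = parse_query(text)
    return relationship(rows, measure, dimension)


if __name__ == "__main__":
    for month, delay in answer("show relationship between arrdelay and month").items():
        print(month, delay)
```

A real system would need far more than one sentence template, and would push the aggregation down to an engine such as Spark rather than loop over rows in memory; the sketch only shows the shape of the translation from a typed question to a plottable result.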

The lowest levels of the product's stack are familiar big data infrastructures: Hadoop, Amazon Redshift, conventional DBMSes, and the rest. Other common big data access tools -- such as Spark, Presto, and Cloudera Impala -- sit on top and are used to perform raw queries.

From there, Adatao adds a machine-learning layer called Predictive Engine, and above that an application-building layer used to build the likes of Narratives or Adatao's dashboarding tools. An open source layer called Distributed Data Frame allows vendors or users to create their own abstractions to data sources by way of Spark.

Because machine learning is such a broad term, it can be tough to assess its use in a given product. While Adatao builds on and connects with a slew of open source technologies, the product itself is proprietary, so it's harder to tell specifically how machine learning is used under the hood.

Most of what was shown in the demo falls in the realm of predictive analytics, an area of machine learning where cloud vendors such as Amazon, Microsoft, and IBM are competing to build easy-to-use services. If Adatao works to provide a more intuitive way to use those products, it will likely fare better than if it tries to compete only on the algorithm side.

Copyright © 2015 IDG Communications, Inc.
