Review: DataRobot aces automated machine learning

DataRobot’s end-to-end AutoML suite not only speeds up the creation of accurate models, but can combine time series, images, geographic information, tabular data, and text in a single model.


Data science, in ordinary practice, is nothing if not tedious. The initial tedium consists of finding data relevant to the problem you’re trying to model, cleaning it, and finding or constructing a good set of features. The next tedium is a matter of attempting to train every plausible machine learning and deep learning model on your data, and picking the best few to tune.
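To make the "train many models and pick the best few" step concrete, here is a minimal sketch in plain Python. It is not DataRobot's method or API. The three toy candidates, the synthetic data, and every function name below are assumptions chosen for illustration. It scores each candidate with k-fold cross-validation and ranks them on a leaderboard, which is the loop that AutoML tools automate at much larger scale.

```python
import random
import statistics


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Each "fit" function takes training data and returns a predictor.
def fit_mean(xs, ys):
    """Baseline: always predict the mean of the training targets."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Ordinary least squares for a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fit_nearest(xs, ys):
    """1-nearest-neighbor: predict the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def cv_rmse(fit, xs, ys, k=5):
    """Root mean squared error over all held-out points in k-fold CV."""
    n = len(xs)
    sq_errors = []
    for fold in k_fold_indices(n, k):
        held_out = set(fold)
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        model = fit(train_x, train_y)
        sq_errors.extend((model(xs[i]) - ys[i]) ** 2 for i in fold)
    return (sum(sq_errors) / n) ** 0.5


def leaderboard(candidates, xs, ys, k=5):
    """Return (name, RMSE) pairs sorted from best to worst."""
    scores = [(name, cv_rmse(fit, xs, ys, k)) for name, fit in candidates.items()]
    return sorted(scores, key=lambda s: s[1])


# Hypothetical candidate set; a real search would cover far more model types.
CANDIDATES = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}

# Synthetic data (an assumption for the demo): a noisy straight line.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [3 * x + 2 + rng.gauss(0, 1) for x in xs]

board = leaderboard(CANDIDATES, xs, ys)
for name, score in board:
    print(f"{name:8s} RMSE={score:.3f}")
```

Even in this toy version, most of the code is bookkeeping rather than modeling, which is the part that automation removes.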

Then you need to understand the models well enough to explain them; this is especially important when the model will be helping to make life-altering decisions, and when decisions may be reviewed by regulators. Finally, you need to deploy the best model (usually the one with the best accuracy and acceptable prediction time), monitor it in production, and improve (retrain) the model as the data drifts over time.
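As an illustration of what "monitoring for drift" can mean in practice, the sketch below computes the population stability index (PSI) between a feature's training-time values and its recent production values. This is one common drift measure, chosen here as an assumption. It is not necessarily the metric DataRobot uses, and the function name, bin count, and smoothing constant are illustrative choices.

```python
import math


def psi(baseline, current, bins=10, eps=1e-6):
    """Population stability index between two samples of one numeric feature.

    Bins are equal-width over the baseline's range; values in `current`
    that fall outside that range are clamped into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid zero width if all values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    p = proportions(baseline)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical data: the production feature has shifted upward by 3 units.
training_values = [i / 100 for i in range(1000)]
production_values = [v + 3 for v in training_values]

print(f"PSI, no drift:   {psi(training_values, training_values):.3f}")
print(f"PSI, with drift: {psi(training_values, production_values):.3f}")
```

A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift worth investigating, though the right threshold for triggering a retrain depends on the model and the data.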

AutoML, i.e. automated machine learning, can speed up these processes dramatically, sometimes from months to hours, and can also lower the human requirements from experienced Ph.D. data scientists to less-skilled data scientists and even business analysts. DataRobot was one of the earliest vendors of AutoML solutions, although they often call it Enterprise AI and typically bundle the software with consulting from a trained data scientist. DataRobot didn’t cover the whole machine learning lifecycle initially, but over the years they have acquired other companies and integrated their products to fill in the gaps.

As shown in the listing below, DataRobot has divided the AutoML process into 10 steps. While DataRobot claims to be the only vendor to cover all 10 steps, other vendors might beg to differ, or offer their own services plus one or more third-party services as a “best of breed” system. Competitors to DataRobot include (in alphabetical order) AWS, Google (plus Trifacta for data preparation), H2O.ai, IBM, MathWorks, Microsoft, and SAS.

The 10 steps of automated machine learning, according to DataRobot: 
