Review: Amazon SageMaker scales deep learning

AWS machine learning service offers easy scalability for training and inference, includes a good set of algorithms, and supports any others you supply

The new, scalable, container-based Azure Machine Learning Services are comparable to SageMaker, although they are still in preview. Microsoft Azure Machine Learning Studio, an older Microsoft product that is still supported, isn’t as scalable as Azure Machine Learning Services or the Azure Data Science Virtual Machine, and doesn’t have as much support for deep neural network frameworks (e.g., TensorFlow) or GPUs.

Azure Machine Learning Studio has many more built-in machine learning (but not deep learning) algorithms than Amazon SageMaker, and has a nice drag-and-drop machine learning workflow designer that SageMaker lacks. Azure Machine Learning Studio includes ETL from a variety of databases and big data sources, while SageMaker uses S3 buckets for data input. But AWS offers other services, such as AWS Glue, that can create the data sets in S3 from other data sources.

Google Cloud Machine Learning Engine is another product comparable to Amazon SageMaker. Google is also starting to release Cloud AutoML services that you can train and customize easily, starting with AutoML Vision.

Finally, H2O Driverless AI builds on H2O.ai with support for multiple GPUs to perform deep learning automatically with feature engineering and hyperparameter optimization. It’s a little ahead of where Amazon SageMaker is right now in terms of automation, although I would expect AWS to add some form of feature engineering automation to SageMaker in the future. AWS hasn’t announced such support, but the company is certainly aware of how much that kind of automation can speed up the data science process.

Amazon SageMaker and you

Which cloud machine learning product should you choose? They all have attractions, they all have similar underpinnings, and they all are improving rapidly. At this point I can’t definitively say that you should pick one over the others, but any one of them is worth your evaluation, including Amazon SageMaker.

As far as SageMaker goes, you should find that it meets your deep learning and machine learning needs as long as you know how to do feature engineering, you can program in Python, and you understand how to pick an algorithm. AWS has always favored offering a few battle-tested machine learning algorithms over giving you every algorithm known to man; with SageMaker you also have a way of plugging in any algorithm you want.

As for scaling and deployment, SageMaker takes most of the sting out of the infrastructure portion. SageMaker also makes it easy to keep your costs low by dynamically creating powerful instances for jobs to train models, and destroying the instances when the jobs are complete. How low? My first SageMaker notebook, a factorization machine training on MNIST data, cost me a whopping nine cents to run—and three cents of that was for S3 storage.
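
To make that arithmetic concrete, here is a minimal Python sketch of how the bill for a transient training job adds up: instances are charged only while the job runs, and storage is added on top. The function name, the hourly rate, and the job duration are all hypothetical. They are not actual AWS prices or figures from my run, and I picked them only so that the compute portion lands near the six cents implied above.

```python
from dataclasses import dataclass


@dataclass
class TrainingJobCost:
    """Cost breakdown, in US dollars, for one transient training job."""
    compute: float
    storage: float

    @property
    def total(self) -> float:
        return self.compute + self.storage


def estimate_job_cost(billable_seconds: int,
                      hourly_rate: float,
                      instance_count: int = 1,
                      storage_cost: float = 0.0) -> TrainingJobCost:
    """Estimate the cost of a job whose instances exist only while it runs.

    Assumes billing proportional to run time, with no minimum charge.
    """
    if billable_seconds < 0 or hourly_rate < 0 or storage_cost < 0:
        raise ValueError("durations and prices must be non-negative")
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    # Instances are created for the job and destroyed afterward,
    # so compute cost scales with run time, not with calendar time.
    compute = instance_count * hourly_rate * billable_seconds / 3600
    return TrainingJobCost(compute=compute, storage=storage_cost)


# Hypothetical inputs: a 10-minute job on one instance at an assumed
# $0.36 per hour, plus the three cents of S3 storage mentioned above.
cost = estimate_job_cost(billable_seconds=600, hourly_rate=0.36,
                         storage_cost=0.03)
print(f"compute ${cost.compute:.2f} + storage ${cost.storage:.2f} "
      f"= ${cost.total:.2f}")
```

The point of the sketch is the shape of the formula rather than the numbers: because nothing keeps running after the job ends, doubling the instance count or the run time doubles the compute charge and nothing else.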

At a Glance
  • Amazon SageMaker is a highly scalable machine learning and deep learning service that supports 11 algorithms of its own, plus any others you supply. Hyperparameter optimization is still in preview, and you need to do your own ETL and feature engineering.

    Pros

    • Excellent, easy scalability for training and inference
    • Models provided with service are robust and perform well
    • Easy on-demand access to high-end GPUs
    • Able to use TensorFlow, MXNet and other machine learning and deep learning frameworks

    Cons

    • Data must be in S3, although other AWS services can perform ETL to S3
    • Fewer models provided than comparable services, although you can use your own
    • You need to do your own feature engineering
    • Hyperparameter optimization is still in preview

Copyright © 2018 IDG Communications, Inc.
