Bossie Awards 2017: The best machine learning tools

InfoWorld picks the best open source software for machine learning and deep learning


The best open source software for machine learning

It’s just possible that you’ve heard something about “machine learning.” Safely ignored by most of us these past many years, machine learning has clearly come into its own in the cloud era, with so much data now available to chew on. So many tools and libraries for machine learning are popping up that we felt compelled to break them out into their own category.



Synaptic

A neural network you can launch in your browser? Of course you want that. Synaptic is deceptively simple: creating a neuron is just var A = new Neuron(). Those with more experience will note that the real difficulty is figuring out how to structure the application, and yes, you may need a bit more data science to pull that off. Synaptic also includes useful practical architectures like LSTM and feed-forward networks. If you want to learn neural networks, this is an understandable, nicely documented code base with great tutorials.

— Andrew C. Oliver


Scikit-learn

If you’re a Python developer with a machine learning background, you probably know that Scikit-learn has most of what you need. Even if you started elsewhere and discover that the algorithm or implementation you need isn’t actually in Spark MLlib, you’ll probably turn to Google and find it in Scikit-learn. It supports a ton of machine learning and statistical algorithms and provides extensive documentation. If you’re a machine learning type and comfortable with Python, you’ll want to make sure Scikit-learn is in your toolbelt.
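Much of Scikit-learn's appeal is that every algorithm shares the same estimator interface: fit, predict, score. A minimal sketch of that workflow, using an illustrative dataset and model (the choices here are examples, not recommendations):

```python
# A minimal Scikit-learn sketch: the estimator API (fit/predict/score)
# is uniform across its algorithms, so swapping models is one line.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Any other classifier (RandomForestClassifier, SVC, ...) drops in here.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(clf.score(X_test, y_test))  # mean accuracy on held-out data
```

The same pattern covers regressors, clusterers, and preprocessing transformers, which is what makes the library's breadth manageable.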

— Andrew C. Oliver


Caffe2

Caffe2 is a lightweight and modular deep learning framework that emphasizes portability while maintaining scalability and performance. It was built as an improvement on Caffe 1.0. Whereas Caffe was designed for convolutional neural networks (CNNs) for vision, Caffe2 also handles other types of neural networks including RNN and LSTM networks for machine translation. Caffe2 is able to scale to distributed systems with multiple GPUs, and is being used in production at Facebook. Existing Caffe models can be upgraded to Caffe2 with a script.

— Martin Heller


H2O

By abstracting away the complexity of distributed machine learning, H2O makes it easy for organizations to build data models and workflows using popular languages such as R, Python, Scala, and Java. H2O includes commonly used machine learning algorithms, which are implemented in-memory across a distributed cluster, and it can read from HDFS, S3, SQL, and NoSQL data sources. Models can be exported as POJOs and managed with traditional SDLC tools.

— Steven Nunez


PyTorch

Although deep learning frameworks are seemingly two a penny at the moment, PyTorch is one to watch. Created and used by Facebook, it provides a powerful, dynamic, and imperative method of defining cutting-edge deep learning architectures. Despite that power, it also includes plenty of support code and documentation for beginners to pick up quickly and easily. Google’s TensorFlow is still the most popular deep learning framework, but expect to see interest in PyTorch take off.
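What "dynamic and imperative" means in practice is define-by-run: the computation graph is built as ordinary code executes, then walked backward for gradients. A toy pure-Python sketch of that idea (this is an illustration of the concept, not PyTorch's actual API):

```python
# Toy define-by-run reverse-mode autodiff: the graph is recorded while
# ordinary Python runs, then replayed backward for gradients. This is a
# conceptual sketch, not PyTorch's API.
class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        self.parents = parents  # (parent_var, local_gradient) pairs

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, grad=1.0):
        # accumulate, then push gradient to parents via the chain rule
        self.grad += grad
        for parent, local in self.parents:
            parent.backward(grad * local)

x = Var(3.0)
y = Var(4.0)
z = x * y + x        # graph built on the fly, as in eager PyTorch code
z.backward()
print(x.grad, y.grad)  # dz/dx = y + 1 = 5.0, dz/dy = x = 3.0
```

Because the graph is rebuilt every run, control flow like Python loops and conditionals can change the architecture per input, which is exactly what makes the dynamic style attractive for research.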

— Ian Pointer


CatBoost

CatBoost is an algorithm for gradient boosting on decision trees that was developed at Yandex, the Russian search engine company, to perform ranking tasks, do forecasts, and make recommendations. CatBoost is available as a command-line application, an R package, and a Python module. It has been used to classify collisions at the Large Hadron Collider. CatBoost reduces overfitting and handles non-numeric categorical features without requiring you to encode the categories as numbers.
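The core idea behind CatBoost's categorical handling is ordered target statistics: each row's category is replaced by a target average computed only from rows seen earlier, plus a smoothing prior, so no row's encoding leaks its own label. A pure-Python sketch of that idea (the function, parameter names, and defaults here are illustrative, not CatBoost's implementation):

```python
# Toy sketch of ordered target statistics, the idea CatBoost uses for
# categorical features: encode each row's category using only the
# target values of *earlier* rows in that category, plus a smoothing
# prior. Not CatBoost's API; names and defaults are illustrative.
def ordered_target_stats(categories, targets, prior=0.5, weight=1.0):
    sums, counts = {}, {}
    encoded = []
    for cat, y in zip(categories, targets):
        s = sums.get(cat, 0.0)
        n = counts.get(cat, 0)
        encoded.append((s + weight * prior) / (n + weight))
        sums[cat] = s + y       # update *after* encoding: no leakage
        counts[cat] = n + 1
    return encoded

cats = ["red", "blue", "red", "red", "blue"]
ys   = [1,      0,      1,     0,     1]
print(ordered_target_stats(cats, ys))
```

The first occurrence of each category falls back to the prior, and later occurrences drift toward that category's running target mean, which is what lets the trees split on string-valued features directly.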

— Martin Heller


XGBoost

Math geeks, data scientists, and machine learning experts know what gradient boosting is. If you don’t, read Kaggle’s gradient boosting explainer. If you have a ranking, classification, or regression problem and want to use gradient boosting, XGBoost is a good solution. It is multi-language and multi-platform, supports GPUs and cloud deployment, and has nice tutorials and getting-started documentation. Moreover, it improves on the basic gradient boosting algorithm with tricks and tweaks that make it perform better.
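The "basic gradient boosting algorithm" XGBoost builds on is a short loop: start from a constant prediction, repeatedly fit a weak learner to the current residuals (the negative gradient of squared-error loss), and add a shrunken copy of it to the ensemble. A bare-bones pure-Python sketch with decision stumps (illustrative only; XGBoost adds regularization, second-order gradients, sparsity handling, and parallelism on top of this loop):

```python
# Bare-bones gradient boosting with decision stumps and squared-error
# loss: each round fits the residuals, i.e. the negative gradient.
def fit_stump(x, residuals):
    # exhaustive search for the single threshold minimizing squared error
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]  # (threshold, left_value, right_value)

def boost(x, y, n_rounds=100, lr=0.3):
    pred = [0.0] * len(y)
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]  # negative gradient
        t, lm, rm = fit_stump(x, resid)
        pred = [p + lr * (lm if xi <= t else rm)      # shrunken update
                for xi, p in zip(x, pred)]
    return pred

x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9]
print(boost(x, y))  # predictions approach y as rounds accumulate
```

The learning rate (shrinkage) deliberately under-corrects each round, trading more rounds for better generalization, a trade-off XGBoost exposes alongside many others.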

— Andrew C. Oliver

GNU Octave

GNU Octave lets you do the same kinds of numerical computation, modeling, visualization, and experimentation as MATLAB, but it is free software in the Free Software Foundation sense, licensed under the GPL. Unlike similar projects, Octave treats incompatibility with MATLAB as a bug and aims to be MATLAB compatible. If you’re going to take that Stanford machine learning course on Coursera, you can use Octave instead of MATLAB. I don’t know of a better endorsement than that.

— Andrew C. Oliver


TensorFlow

TensorFlow is a software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. Currently, the best-supported client language is Python. Experimental interfaces for executing and constructing graphs are also available for C++, Java, and Go, plus a C-based client API and a few community-contributed bindings.
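The data flow graph model separates two phases: first you describe the graph of operations, then you execute it with concrete values fed into placeholder nodes. A toy pure-Python sketch of that two-phase structure (a conceptual illustration, not TensorFlow's API):

```python
# Toy sketch of the dataflow-graph model TensorFlow is built on:
# construct the graph of operations first, execute it later with
# values fed in. Conceptual only; not TensorFlow's API.
class Node:
    def __init__(self, op, inputs=(), name=None):
        self.op, self.inputs, self.name = op, inputs, name

    def run(self, feed):
        if self.op == "placeholder":
            return feed[self.name]       # value supplied at execution time
        vals = [n.run(feed) for n in self.inputs]
        if self.op == "add":
            return vals[0] + vals[1]
        if self.op == "mul":
            return vals[0] * vals[1]

# graph-construction phase: nothing is computed yet
a = Node("placeholder", name="a")
b = Node("placeholder", name="b")
c = Node("add", (a, b))
d = Node("mul", (c, a))

# execution phase: feed values and evaluate the graph
print(d.run({"a": 2.0, "b": 3.0}))  # (2 + 3) * 2 = 10.0
```

Because the whole computation exists as a data structure before it runs, the runtime can optimize it, partition it across devices, and differentiate through it, which is where TensorFlow's CPU/GPU/mobile portability comes from.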

— Martin Heller

Microsoft Cognitive Toolkit

The Microsoft Cognitive Toolkit, also known as CNTK, is a unified deep-learning toolkit that describes neural networks as a series of computational steps via a directed graph. CNTK can realize and combine popular model types such as feed-forward DNNs, convolutional nets (CNNs), and recurrent networks (RNNs/LSTMs). It implements stochastic gradient descent (SGD) learning with error back-propagation, automatic differentiation, and parallelization across multiple GPUs and servers.

— Martin Heller

Apache MXNet

Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to mix symbolic and imperative programming to maximize efficiency and productivity. At its core, MXNet contains a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer on top of that makes symbolic execution fast and memory efficient. MXNet is portable and lightweight, scaling effectively to multiple GPUs and multiple machines.

— Martin Heller

Apple Core ML

Core ML is Apple's framework for integrating trained machine learning models into an iOS or macOS app. Core ML supports Apple’s Vision framework for image analysis, Foundation framework for natural language processing, and GameplayKit framework for evaluating learned decision trees. Currently, Core ML cannot train models itself, and the only trained models Apple supplies in Core ML format are for image classification. However, Core ML Tools, a Python package, can convert models from Caffe, Keras, scikit-learn, XGBoost, and LIBSVM.

— Martin Heller