5 machine learning tools to ease software development

AI-driven development tools that provide code auto-completion, code vulnerability detection, and even cutting-edge code generation


Most discussions of developers making use of machine learning revolve around creating AI-powered applications and the tools used to create them: TensorFlow, PyTorch, Scikit-learn, and so on.

But there is another way machine learning is impacting software development: through new development tools that use machine learning techniques to make programming easier and more productive. Here are five projects (three commercial, two experimental) that put machine learning to work for developers within the development process.


Kite

Kite is a code completion tool, available for most major code editors, that uses machine learning techniques to fill in your code as you’re typing it.

The machine learning model used by Kite is created by taking publicly available code on GitHub, deriving an abstract syntax tree from it, and using that as a basis for the model. According to Kite, this allows auto-suggestion and auto-completion to be derived from the context and intention of the code, rather than just the text.
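To illustrate the difference between a program's text and its syntax tree, here is a minimal sketch using Python's standard `ast` module. It is not Kite's actual pipeline, which is not described here; the snippet strings and the helper names `tree_shape` and `method_calls` are made up for illustration.

```python
import ast
from collections import Counter

# Two snippets that differ as text (spacing) but not as code.
SNIPPET_A = "result = handle.read( ).strip( )"
SNIPPET_B = "result=handle.read().strip()"

def tree_shape(source):
    """Return a textual dump of the syntax tree, which ignores layout."""
    return ast.dump(ast.parse(source))

def method_calls(source):
    """Count calls of the form obj.method(...) found in the tree."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        # A Call node whose callee is an Attribute is a method-style call.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            counts[node.func.attr] += 1
    return counts

same_tree = tree_shape(SNIPPET_A) == tree_shape(SNIPPET_B)
calls = method_calls(SNIPPET_A)
print(same_tree, dict(calls))
```

Both snippets produce the same tree even though their text differs, and the tree makes it easy to pull out structural facts such as which methods are called. Facts of this kind, gathered over a large body of code, are the sort of signal a completion model could learn from.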

Right now Kite is only available for Python developers, but Go support is in the works. And while Kite was originally available only for Windows and MacOS users, now it supports Linux too.

In 2017, Kite raised concerns in the open source community over its handling of user data and its modification of the autocomplete-python package for Atom. The company has addressed both concerns, claiming that Kite no longer sends user code back to its cloud servers, instead performing all processing locally, and explicitly acknowledging that the autocomplete-python package is a Kite-sponsored package.


Codota

Codota is outwardly similar to Kite. It uses a machine learning model, trained on Java and Kotlin code, to suggest autocompletions for these languages as you type. Like Kite, Codota uses the code’s syntax tree, not merely its text, as the data for building its models.

Unlike the revamped Kite, Codota uses a cloud-based service to generate and serve predictions. However, according to the service’s documentation, Codota does not send user code to the Codota server, but only “minimal contextual information from the currently edited file that allows us to make predictions based on the current local scope.”

Codota is available for Windows, MacOS, and Linux, but editor support is limited to IntelliJ, Android Studio, and Eclipse (Luna or later). This makes sense given its focus on the Java and Kotlin languages. The company notes that support for other languages is in the works, with JavaScript among the first in line. (Beta support for WebStorm, the JetBrains JavaScript IDE, is available now.)

The free version of Codota uses predictions created from freely available code. The enterprise version (pricing on request) can use private code repositories for training.


DeepCode

DeepCode performs automated, AI-guided reviews of code to detect potential security vulnerabilities. Like Kite and Codota, DeepCode analyzes code available in public repositories to look for common patterns. But DeepCode uses those patterns to identify security holes.

DeepCode focuses on “taint analysis,” which determines how user input is handled before it reaches any security-critical point. Data that goes directly from user input to, say, a SQL query, without verifying that it’s safe to pass along, is considered “tainted” and raises an alert.
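The kind of flow a taint analysis is meant to catch can be shown with a small, self-contained sketch. This is not DeepCode's analysis, only an example of the pattern described above, using Python's standard `sqlite3` module with an in-memory database. The table, the sample rows, and the function names are invented for illustration.

```python
import sqlite3

def make_db():
    """Create an in-memory database with two sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn

def lookup_tainted(conn, name):
    # Tainted flow: user input is spliced directly into the SQL text.
    query = f"SELECT secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def lookup_safe(conn, name):
    # Safe flow: the value is bound as a parameter, never parsed as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()

conn = make_db()
malicious = "nobody' OR '1'='1"   # input crafted to rewrite the WHERE clause
leaked = lookup_tainted(conn, malicious)
blocked = lookup_safe(conn, malicious)
print(len(leaked), len(blocked))
```

In the tainted version the crafted input changes the meaning of the query and every row comes back. In the parameterized version the same input is treated as a plain string and matches nothing.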

The critical bugs DeepCode claims to flag include common security issues found in web applications: cross-site scripting, SQL injection attacks, remote code execution, and path-traversal attacks.

DeepCode’s analyses are available for GitHub and Bitbucket repositories, and they cost nothing for open source projects or private projects with up to 30 developers. DeepCode is also available for scans of on-premises code hosting (e.g. GitHub Enterprise), with pricing available only on request.

Microsoft PROSE

PROSE is an acronym for “PROgram Synthesis using Examples.” This Microsoft project is an SDK for generating code from sample input and output. Thus PROSE is a toolkit that could be used to build predictive coding tools, rather than a predictive coding tool itself.

Potential PROSE applications include transforming text by example (one implementation of this is Microsoft’s own “Flash Fill” function in Excel), extracting data from text files (e.g., log analytics), and predictive file manipulation (e.g., splitting text into columns by example).
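To make "programming by example" concrete, here is a toy sketch of the idea in plain Python. It does not use the PROSE SDK or mirror its API. The tiny set of string operations and the brute-force search are assumptions chosen only to show how a program can be recovered from input/output pairs.

```python
from itertools import product

# Hypothetical mini-language: each primitive maps a string to a string.
PRIMITIVES = {
    "upper": str.upper,
    "lower": str.lower,
    "strip": str.strip,
    "first_word": lambda s: s.split()[0] if s.split() else "",
    "last_word": lambda s: s.split()[-1] if s.split() else "",
    "initial": lambda s: s[:1],
}

def run(program, text):
    """Apply each named primitive in order."""
    for name in program:
        text = PRIMITIVES[name](text)
    return text

def synthesize(examples, max_len=3):
    """Return the shortest primitive sequence consistent with all examples."""
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, given) == wanted for given, wanted in examples):
                return program
    return None  # nothing in the search space fits the examples

examples = [("ada lovelace", "LOVELACE"), ("alan turing", "TURING")]
program = synthesize(examples)
print(program, run(program, "grace hopper"))
```

Given two examples, the search finds a short program that also generalizes to unseen input. Real synthesis tools need far smarter search and ranking than this exhaustive loop, since the space of candidate programs grows very quickly.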


Pix2code

The premise of Pix2code sounds like science fiction. Feed it a screenshot of a graphical user interface, and Pix2code will generate code that renders that GUI. Pix2code uses a deep learning model, trained on a data set provided with the software, to produce GUIs in Android XML, iOS Storyboard, and HTML/CSS formats. Pix2code is an experimental research project (“shared for educational purposes only”), so any work done with it would need to use the project as a base for further development.

Copyright © 2019 IDG Communications, Inc.