What’s new in the Anaconda distribution for Python

Anaconda 5.2 adds job scheduling, support for GPUs, and integration with version control systems including Git and GitHub


Anaconda, the Python language distribution and work environment for scientific computing, data science, statistical analysis, and machine learning, is now available in version 5.2, with additions to both its enterprise and open-source community editions.

Where to download Anaconda 5.2

The community edition of Anaconda Distribution is available for free download directly from Anaconda’s website. The for-pay enterprise edition, with professional support, requires contacting the Anaconda (formerly Continuum Analytics) sales team.

Current version: What’s new in Anaconda 5.2

The enterprise edition of Anaconda 5.2, released this week, adds new features around job scheduling, integration with Git, and GPU acceleration.

Earlier versions of Anaconda Enterprise were built to allow professionals to leverage multiple machine learning libraries in a business context—TensorFlow, MXNet, Scikit-learn, and more. In version 5.2, Anaconda offers ways to train models on a securely shared central cluster of GPUs, so that models can be trained faster and more cost-effectively.

Also new in Anaconda Enterprise is the ability to integrate with external code repositories and continuous integration tools. Supported version control systems and hosting services include Git, Mercurial, GitHub, and Bitbucket. A new job scheduling system allows tasks to be run at regular intervals, for instance to retrain a model on new data.

Changes in the community version include the following:

  • Security fixes for 20 or so packages, based on CVE analyses.
  • Fixes to the Windows installer to prevent using invalid install paths or causing collisions with existing software components.
  • Better use of working directories on Windows in multi-user installation scenarios.

Previous version: What’s new in Anaconda 5.1

Anaconda 5.1 and the point releases that followed mostly brought minor touch-ups to both the enterprise and community editions.

Some notable changes to the enterprise edition include a new post-install setup script and GUI that ease the post-configuration needed with a new Anaconda Enterprise install (for instance, when setting up TLS certificates). You also have the ability to generate “custom Anaconda installers, parcels for Cloudera CDH, and management packs for Hortonworks HDP.” Changes to the community edition include the ability to use Microsoft Visual Studio Code as an editor option at install time.

Previous version: What’s new in Anaconda 5.0

The Linux and MacOS versions of Anaconda 5 have been built with new compilers: GCC 7.2 for Linux and Clang 4.0.1 for MacOS. This extends the speed benefits of those compilers to users of earlier versions of those OSes, as far back as MacOS 10.9 Mavericks and CentOS 6.

Anaconda 5 also provides Python packages rebuilt with the new compiler, through its package-management tool conda. However, for the time being, those rebuilt packages are available through a different installation channel.

Anaconda’s long-term plan is to make that new installation channel the default, as more packages get added to the new channel and as users obtain the newly optimized packages and give them a shakedown.

Anaconda’s conda tool simplifies installing Python packages used in stats and data analysis, because many of those packages have complex binary dependencies. Conda-forge is a GitHub organization where users can share packages, build recipes, and distributions of projects built for conda.

Some 3,200 packages from Conda-forge are available in their own package list. Among the most recently updated:

  • cassandra-driver, a Python module for working with Apache Cassandra and its binary data-access protocol.
  • pyinstaller, for bundling a Python app as a self-contained executable.
  • plotly, an interactive graphing library.
  • openblas, a library for basic vector and matrix math.
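As a rough illustration of how one of the packages above might be installed from the conda-forge channel, the Python sketch below assembles a conda install command line without running it. The helper name conda_install_command is hypothetical, and the sketch assumes that conda is on the PATH and that the channel is named conda-forge.

```python
import shlex


def conda_install_command(packages, channel="conda-forge"):
    """Build (but do not run) a `conda install` command as an argument list.

    `conda_install_command` is a hypothetical helper for illustration only.
    """
    if not packages:
        raise ValueError("at least one package name is required")
    # --yes skips the confirmation prompt; --channel selects the package source.
    return ["conda", "install", "--yes", "--channel", channel, *packages]


cmd = conda_install_command(["plotly", "pyinstaller"])
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
```

To execute the command rather than print it, the list could be passed to subprocess.run(cmd, check=True) on a machine where conda is installed.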

Anaconda’s strategy moving forward is to use Conda-forge as its source for build recipes, both for consistency’s sake and to allow a broader range of third-party packages to be used in Anaconda.

Also new in Anaconda 5.0:

  • More than 100 packages available through conda have been updated or revised. One major project for accelerating computational speeds on conventional CPUs, the Intel Math Kernel Library, is now available in version 2018.0.0.
  • NumPy users can now work with a wider range of versions of that popular math and statistics package. Other packages in Anaconda’s suite may depend on different versions of NumPy, but users may want access to the latest and greatest version. (Anaconda’s term for this is “dependency pinning.”)
  • R language users now have access to R version 3.4.2. All of R’s packages, including RStudio, were rebuilt using Anaconda’s new compilers.
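
The article does not show how a pin is written, so the following Python sketch only illustrates the general idea: a pin restricts a package to versions matching a pattern. It assumes a glob-style pin such as 1.13.*, which is a simplification of the matching rules conda applies. The helpers matches_pin and pinned_line are hypothetical names.

```python
from fnmatch import fnmatchcase


def matches_pin(version: str, pin: str) -> bool:
    """Return True if `version` matches a glob-style pin such as '1.13.*'.

    Simplified illustration; conda's real version matching has more rules.
    """
    return fnmatchcase(version, pin)


def pinned_line(package: str, pin: str) -> str:
    """Format a 'package pin' pair, e.g. 'numpy 1.13.*' (hypothetical helper)."""
    return f"{package} {pin}"


print(pinned_line("numpy", "1.13.*"))
for candidate in ("1.13.3", "1.14.0"):
    print(candidate, matches_pin(candidate, "1.13.*"))
```

Under this simplified rule, a package pinned to 1.13.* accepts NumPy 1.13.3 but rejects 1.14.0.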