Review: Nvidia’s Rapids brings Python analytics to the GPU

An end-to-end data science ecosystem, open source Rapids gives you Python dataframes, graphs, and machine learning on Nvidia GPU hardware


The Dask project is independent of Rapids and includes several features that a practicing data scientist will find useful, such as low-latency, high-throughput messaging via OpenUCX (normally found in high-performance computing applications). Dask was made for accessing high-speed block data stores through InfiniBand or large data sets in cloud storage. Dask integration really extends the reach of Rapids.

Rapids at a glance

Rapids addresses one of the biggest challenges of machine learning with Python — slow execution — and does so in an elegant way, by making existing code run on the GPU nearly unchanged. It’s hard to fault the vision or enthusiasm of Nvidia or the community.
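To make "nearly unchanged" concrete, here is a minimal sketch, not taken from the Rapids documentation. It assumes cuDF is installed on a machine with a supported GPU and falls back to pandas otherwise. The helper name mean_by_key and the sample data are hypothetical, chosen only for illustration.

```python
# Use the GPU dataframe library if it is available; otherwise run the
# same code on the CPU with pandas. Only the import line differs.
try:
    import cudf as xdf  # Rapids GPU dataframes (needs an Nvidia GPU)
except ImportError:
    import pandas as xdf  # CPU fallback with a very similar API


def mean_by_key(frame_module):
    """Group a tiny dataframe by key and return the per-key mean as a dict."""
    df = frame_module.DataFrame(
        {"key": ["a", "b", "a", "b"], "value": [1.0, 4.0, 3.0, 6.0]}
    )
    # sort_index() makes the ordering explicit rather than relying on
    # each library's default groupby ordering.
    means = df.groupby("key")["value"].mean().sort_index()
    # A cuDF result lives in GPU memory; copy it to the host before
    # converting to a plain dict. A pandas result has no to_pandas().
    if hasattr(means, "to_pandas"):
        means = means.to_pandas()
    return means.to_dict()


result = mean_by_key(xdf)
print(result)
```

The dataframe calls are identical on both paths. The only backend-specific step is copying the result back to host memory, which is also where data movement costs show up in real workloads.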

While this project has a great deal of momentum, I can’t recommend it for enterprise work at the moment. There are simply too many changes going on under the hood as it evolves. My recommendation for enterprises is “watch this space,” wait for things to settle down, and, if possible, get a few developers involved in contributing. What is most needed is work on making the underlying C++ libraries, particularly cuDF, useful from other languages, like R, Java, and Go.

Cost: Free and open source, licensed under Apache 2.0.

Platform: Requires an Nvidia GPU of the Pascal generation or newer (compute capability 6.0 or higher), plus recent CUDA and Nvidia drivers. Supports Ubuntu 16.04, Ubuntu 18.04, CentOS 7, and Docker CE 19.03 or later with nvidia-container-toolkit.
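The compute capability floor is easy to check once you know your card's version string. The sketch below is illustrative only. The function names are hypothetical, and it assumes you already have the capability as text such as "6.1" (from Nvidia's published GPU tables, for instance) rather than querying the driver.

```python
# Minimum compute capability quoted in the platform requirement above.
MIN_COMPUTE_CAPABILITY = (6, 0)


def parse_compute_capability(text):
    """Turn a string such as '7.5' into a (major, minor) tuple of ints."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def meets_minimum(text, minimum=MIN_COMPUTE_CAPABILITY):
    """Return True if the given capability is at or above the minimum."""
    # Tuples compare element by element, so (6, 1) >= (6, 0) is True
    # and (5, 2) >= (6, 0) is False.
    return parse_compute_capability(text) >= minimum


for capability in ("5.2", "6.0", "7.5"):
    print(capability, meets_minimum(capability))
```

Comparing tuples rather than floats avoids rounding surprises and handles minor versions correctly.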

At a Glance
  • Rapids addresses one of the biggest challenges of machine learning with Python — slow execution — in an elegant way, by making existing code run on the GPU nearly unchanged.

    Pros

    • Fast-paced, open source development
    • Academic involvement brings in the latest algorithmic advances (cuGraph)
    • Reduced data movement speeds up the model building cycle
    • Strong community

    Cons

    • Linux only
    • Tied to Nvidia hardware
    • Lack of C bindings limits third-party language access
    • Documentation sparse and often out of date

Copyright © 2020 IDG Communications, Inc.
