MapD database extends GPU power to AWS, Google Cloud

Version 2 has faster queries and a new web-based visualization tool, with plans to spread to Microsoft Azure too

MapD, creator of a GPU-powered database system that it claims is “anywhere from 75 to 3,500 times faster than traditional CPU-bound databases,” has released a new version, with a web-powered analytics front end and editions for several major cloud environments.

Version 2.0's collection of features reads like a roster of what will likely become standard in next-generation databases: GPU-accelerated analytics; interactive data that requires nothing more than a browser to work with; and an edition that takes advantage of GPUs available at scale in the cloud.

Faster on the inside, faster on the outside

The first version of MapD demonstrated that GPUs could be used to accelerate database queries, and not only by running GPU-based algorithms for number-crunching, as many popular machine learning frameworks do. MapD also uses the memory of up to eight GPU cards on a server as a fast and highly localized data cache, and it compiles queries to native machine code to run at top speed on CPUs and GPUs alike.
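To make the idea of query compilation concrete, here is a toy sketch in Python. It is not MapD's implementation and it produces no native machine code; it only illustrates the general approach of generating one specialized function per query instead of re-interpreting the filter for every row. The helper `compile_filter`, the `trips` table, and its column names are all hypothetical.

```python
# Toy illustration of query compilation (hypothetical, not MapD's code):
# generate a specialized function once per query, then reuse it.

def compile_filter(column, op, value):
    """Build and compile a filter function for one column comparison."""
    allowed_ops = {"<", "<=", ">", ">=", "==", "!="}
    if op not in allowed_ops:
        raise ValueError(f"unsupported operator: {op}")
    if not isinstance(value, (int, float)):
        raise ValueError("only numeric literals are supported in this sketch")

    # The comparison is baked into the generated source, so no operator
    # dispatch happens while the rows are being scanned.
    source = (
        "def _filter(columns):\n"
        f"    col = columns[{column!r}]\n"
        f"    return [i for i, v in enumerate(col) if v {op} {value!r}]\n"
    )
    namespace = {}
    exec(compile(source, "<query>", "exec"), namespace)
    return namespace["_filter"]


# A small column-oriented table held in memory (made-up sample data).
trips = {
    "fare": [5.5, 12.0, 30.25, 8.0],
    "passengers": [1, 2, 1, 4],
}

fare_over_10 = compile_filter("fare", ">", 10)
matching_rows = fare_over_10(trips)  # row indexes where fare > 10
```

A real engine of this kind would emit machine code for the CPU or GPU rather than Python source, but the payoff is the same in spirit: the per-query work is done once, up front.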

MapD claims the latest version expands on the original idea by speeding up common SQL operations, such as GROUP BY queries and full-text search, as well as by improving data connectivity through JDBC and integration with Hadoop.
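For readers unfamiliar with the kind of query in question, the sketch below runs a typical GROUP BY aggregation. It uses Python's built-in `sqlite3` module purely as a stand-in, since the SQL itself is standard; it does not connect to MapD, and the `trips` table and its contents are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is not a MapD connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (borough TEXT, fare REAL)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [
        ("Manhattan", 12.0),
        ("Brooklyn", 8.0),
        ("Manhattan", 20.0),
        ("Queens", 15.0),
        ("Brooklyn", 10.0),
    ],
)

# Aggregate rides and average fare per borough.
rows = conn.execute(
    "SELECT borough, COUNT(*) AS rides, AVG(fare) AS avg_fare "
    "FROM trips GROUP BY borough ORDER BY borough"
).fetchall()
conn.close()

for borough, rides, avg_fare in rows:
    print(f"{borough}: {rides} rides, average fare {avg_fare:.2f}")
```

Aggregations like this touch every row in the table, which is why they are a natural target for the parallelism a GPU offers.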

Another feature, MapD Immerse, is a web-based data visualization tool in the same vein as Tableau or Qlik Sense. Nothing prevents users from exploring MapD data with third-party tools by way of an ODBC connector or Apache Thrift; in fact, the company lists that as a selling point. However, MapD claims Immerse stands apart from third-party solutions because it’s built to take advantage of MapD’s parallelism.

“We can give the user concurrent, multidimensional views into their data because of the high speed available to us via the GPU,” MapD said. This allows “for maximum emphasis on instant data exploration, without the need to tiptoe around computing resources.” (One example: a live exploration of New York City taxi ride data.)

MapD is also pushing the web-based nature of Immerse as a boon for developers, since they can use the product’s existing JavaScript APIs to develop their own web-based charting and dashboarding tools, as opposed to relying on a third-party visualization solution.

Faster up there, too

MapD has also expanded how its product is deployed. The previous release was available as on-premises software, as a hardware appliance, and through SoftLayer’s (now IBM’s) bare-metal cloud, whereas Version 2.0 is also available through Amazon Web Services and Google Cloud. MapD’s team says plans are in the works to expand to Azure as well.

Running MapD on any of those services would have been possible, but nowhere near as effective, without their respective GPU-enabled instances. AWS has had GPU-powered instances for some time now, but recently revamped them in a way that echoes Google’s newly unveiled GPU offerings: GPUs can be attached to or detached from instances rather than being coupled exclusively to a given instance.

MapD is hardly alone in the GPU-powered database world; it has many cohorts and competitors, both open and closed source. What’s buoying them all is that GPU development continues to outpace (and stimulate) CPU development, and the biggest proving ground for GPU-powered applications will be a cloud that’s becoming more receptive to them. 

The risk MapD faces lies in how well, and how quickly, the rest of the database world emulates its example and offers the same features to its own customers.