Startup MapD Technologies has released a new database and analytics platform that claims to be orders of magnitude faster than competing solutions, due to its use of GPU processing to accelerate queries.
Founded by Todd Mostak, a former researcher at MIT's Computer Science and Artificial Intelligence Laboratory, MapD executes queries in parallel on up to eight GPU cards per physical server. The video memory of each GPU is used as a high-speed data cache.
The company claims a single MapD node can return results for "five common analytical queries" from a 1.2-billion-row data set in less than a second, versus multisecond timings for two competing (albeit unnamed) in-memory databases and a 10-node Hadoop OLAP cluster.
Most of this speed is credited to the high memory bandwidth available to the GPUs: "A server with 8 Nvidia K80s [typically with 24GB of memory] has nearly 4TB/sec of bandwidth versus perhaps 100GB/sec over two Xeon CPUs," said Mostak in an email. GPU memory is also used as an L1 cache for data, so the data doesn't need to move across the PCI bus.
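To put the quoted bandwidth figures in perspective, here is a back-of-the-envelope sketch in Python. It is not a benchmark and says nothing about MapD's actual query times. The per-card figure of 480GB/sec is an assumption chosen to be consistent with Mostak's "nearly 4TB/sec" for eight cards, and the 8-byte column width is purely hypothetical. The result is a best-case, bandwidth-bound scan time for one column of a 1.2-billion-row table.

```python
# Rough bandwidth arithmetic based on the figures quoted in the article.
# ASSUMPTION: 480 GB/sec per K80 card (8 cards -> "nearly 4TB/sec").
# ASSUMPTION: a single 8-byte column is scanned; real queries touch more.

GPUS_PER_SERVER = 8
GPU_BANDWIDTH_GB_S = 480    # assumed per-card memory bandwidth
CPU_BANDWIDTH_GB_S = 100    # "perhaps 100GB/sec over two Xeon CPUs"

ROWS = 1_200_000_000        # the 1.2-billion-row data set
BYTES_PER_VALUE = 8         # hypothetical column width


def scan_seconds(data_gb: float, bandwidth_gb_s: float) -> float:
    """Ideal time to stream data_gb through memory at the given bandwidth."""
    return data_gb / bandwidth_gb_s


gpu_total_gb_s = GPUS_PER_SERVER * GPU_BANDWIDTH_GB_S
bandwidth_ratio = gpu_total_gb_s / CPU_BANDWIDTH_GB_S
column_gb = ROWS * BYTES_PER_VALUE / 1e9

gpu_scan = scan_seconds(column_gb, gpu_total_gb_s)
cpu_scan = scan_seconds(column_gb, CPU_BANDWIDTH_GB_S)

print(f"Aggregate GPU bandwidth: {gpu_total_gb_s} GB/sec")
print(f"GPU-to-CPU bandwidth ratio: {bandwidth_ratio:.1f}x")
print(f"Column size: {column_gb:.1f} GB")
print(f"Ideal GPU scan: {gpu_scan * 1000:.2f} ms")
print(f"Ideal CPU scan: {cpu_scan * 1000:.2f} ms")
```

Under these assumptions the gap is a little under 40x, which is the ceiling that memory bandwidth alone would allow. Actual speedups depend on the query, the data layout and how much of the work is bound by bandwidth.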
SQL queries are compiled to native GPU code via the LLVM compiler framework, but can also be compiled and run on each node's CPUs if needed. The latter can operate as a fallback if the data set for a query doesn't fit in GPU memory.
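The fallback described above amounts to a capacity check. The Python sketch below is a hypothetical illustration only: the `Node` class and `choose_backend` function are invented names, not part of MapD's software, and the real system may decide differently (for example, per query fragment rather than per query).

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class Node:
    """Hypothetical description of one server's GPU resources."""
    gpu_count: int
    gpu_memory_bytes: int  # memory per GPU card

    @property
    def total_gpu_memory(self) -> int:
        return self.gpu_count * self.gpu_memory_bytes


def choose_backend(node: Node, working_set_bytes: int) -> str:
    """Return "gpu" if the query's data fits in GPU memory, else "cpu"."""
    if working_set_bytes <= node.total_gpu_memory:
        return "gpu"
    return "cpu"


# Example: eight cards with 24GB each, as in the configuration Mostak cites.
node = Node(gpu_count=8, gpu_memory_bytes=24 * GIB)
print(choose_backend(node, 100 * GIB))  # fits within the 192 GiB total
print(choose_backend(node, 200 * GIB))  # exceeds the 192 GiB total
```

A single threshold keeps the sketch short. It ignores memory needed for intermediate results, which a production system would have to budget for.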
Mostak noted another advantage of using GPUs in this fashion: It allows performance to grow with breakthroughs in GPU technology and GPU memory speed, which can happen apart from improvements in CPU and CPU-to-memory speeds.
Mostak said, "[4TB/sec] GPU bandwidth should double to triple with the High Bandwidth Memory (HBM) that will be used on Nvidia’s new Pascal architecture, likely to be launched next week at Nvidia’s GPU Technology Conference." (Nvidia is an investor in MapD, along with Vanedge Capital and Google Ventures.)
MapD plans to offer its database in three incarnations: via Softlayer's bare-metal cloud, with two Nvidia Kepler-series K80 GPUs; as a Supermicro appliance with anywhere from four to eight K80s; and as software to be run on compatible hardware provided by the customer.
K80s aren't cheap; a single unit sports a list price of around $4,300. But MapD hopes the raw query speed provided by GPU acceleration will prove valuable and enticing, especially as costs fall, bandwidths rise, and GPUs accelerate at rates that outpace their CPU counterparts.