In-memory data grids vs. in-memory databases

Selecting the right option for accelerating applications can reduce complexity and save time and money


The adoption of in-memory computing continues to accelerate. Mature solutions enable organizations to obtain the database processing speed and scale they require for their digital transformation and omnichannel customer experience initiatives. For example, investment firm Wellington Management used an in-memory computing platform to accelerate and scale its investment book of record (IBOR), the single source of truth for investor positions, exposure, valuations, and performance. All real-time trading transactions, all related account activity, third-party data such as market quotes, and all related back-office activity flow through its IBOR in real time. The IBOR also supports performance analysis, risk assessments, regulatory compliance, and more. In various tests, the new platform performed at least ten times faster than the company’s legacy system built directly on an Oracle relational database.

Gartner predicts that by 2019, 75 percent of cloud-native application development will use in-memory computing, or services built on it, enabling mainstream developers to implement high-performance, massively scalable applications. However, developers new to in-memory computing need to understand the different strategies for adding the technology to their architectures. In most cases, the first decision is whether to deploy an in-memory data grid or an in-memory database. That choice depends primarily on whether they intend to accelerate existing applications, develop new applications (or completely rearchitect existing ones), or both. They also need to decide which layer will serve as the system of record: the in-memory computing layer or the underlying data layer.

Let’s explore the in-memory computing technologies needed to implement these strategies.

In-memory data grids

An in-memory data grid (IMDG) copies disk-based data from RDBMS, NoSQL, or Hadoop databases into RAM, where processing takes place without the delays caused by continual disk reads and writes. Inserted between the application and data layers, the in-memory data grid is deployed on a cluster of server nodes and shares the available memory and CPU of the cluster. Whether deployed in a public or private cloud environment, on-premises, or in a hybrid environment, an in-memory data grid can be scaled simply by adding a new node to the cluster. Some in-memory data grids can support ANSI-99 SQL and ACID transactions, advanced security, machine learning, and Spark, Cassandra, and Hadoop native integrations.
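The architecture above can be sketched in a few lines. This is an illustrative model only, not any vendor's API: the `DiskStore`, `GridNode`, and `DataGrid` names are invented for the example. It shows the two defining behaviors of a grid inserted between the application and data layers: keys are partitioned across a cluster of nodes, and a read that misses RAM falls through to the disk-based store exactly once.

```python
# Minimal sketch of an in-memory data grid between an application and a
# disk-based store. All class names are illustrative, not a real grid API.
import hashlib

class DiskStore:
    """Stands in for the underlying RDBMS/NoSQL database."""
    def __init__(self):
        self._rows = {}
        self.reads = 0
    def read(self, key):
        self.reads += 1
        return self._rows.get(key)
    def write(self, key, value):
        self._rows[key] = value

class GridNode:
    """One server node; holds its partition of the data in RAM."""
    def __init__(self):
        self.memory = {}

class DataGrid:
    """Partitions keys across nodes; reads through to disk on a miss."""
    def __init__(self, backing_store, node_count=3):
        self.store = backing_store
        self.nodes = [GridNode() for _ in range(node_count)]
    def _node_for(self, key):
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]
    def get(self, key):
        node = self._node_for(key)
        if key not in node.memory:            # cache miss: one disk read
            node.memory[key] = self.store.read(key)
        return node.memory[key]               # later reads stay in RAM
    def put(self, key, value):                # write-through keeps disk in sync
        self._node_for(key).memory[key] = value
        self.store.write(key, value)
```

A real grid also rebalances partitions when a node is added to the cluster, which is what makes the "scale by adding a node" property work; that machinery is omitted here for brevity.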

An in-memory data grid is a simple and cost-effective way to accelerate existing applications. However, many in-memory data grids require that all the data in the underlying disk-based database fit into memory, which means buying enough RAM to hold the entire dataset. Because memory is still more expensive than disk, many companies prefer to keep some data only on disk. Newer memory-centric architectures address this by processing against the full dataset even when part of it resides only on disk. With this "persistent store" capability, the dataset can exceed the amount of available memory: all data resides on disk, frequently used data is also kept in memory, and infrequently used data stays on disk only. Another key advantage is that after a reboot, a system with a persistent store can begin processing immediately against the on-disk dataset without waiting for the data to reload into memory.
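The persistent-store layout described above amounts to a bounded hot tier in RAM over a complete copy on disk. The sketch below is illustrative (a plain dict stands in for the disk layer, and `TieredStore` is an invented name): every write lands on disk, a fixed-size LRU subset is also kept in memory, and after a simulated restart reads proceed immediately against the on-disk dataset.

```python
# Sketch of a "persistent store": full dataset on disk, bounded hot set in
# RAM. Illustrative only; real systems manage pages, not whole values.
from collections import OrderedDict

class TieredStore:
    def __init__(self, disk, memory_capacity):
        self.disk = disk              # full dataset; survives restarts
        self.hot = OrderedDict()      # LRU subset kept in RAM
        self.capacity = memory_capacity
    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)     # mark as recently used
            return self.hot[key]
        value = self.disk[key]            # full dataset always reachable
        self._promote(key, value)
        return value
    def put(self, key, value):
        self.disk[key] = value            # every write lands on disk
        self._promote(key, value)
    def _promote(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        if len(self.hot) > self.capacity:
            self.hot.popitem(last=False)  # evict LRU entry from RAM only
    def restart(self):
        self.hot.clear()                  # after a reboot RAM is cold, but
                                          # gets proceed against disk at once
```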

Workday, a financials and HR SaaS solution provider serving Fortune 50 companies, related how it uses an in-memory data grid to process about 189 million transactions per day, peaking at around 289 million per day. For comparison, Twitter handles approximately 500 million tweets per day.

In-memory databases

An in-memory database (IMDB) is best suited for new or re-architected applications. It is a full-featured, standalone database running in-memory that supports data processing APIs such as ANSI-99 SQL, key-value, compute, and machine learning. The advantage of an in-memory database over an in-memory data grid is that the architecture is reduced from three layers (application, in-memory, and data) to two. The disadvantage is that it cannot be used for an existing application without a lift and shift of the dataset from the existing database. Furthermore, because an in-memory database serves as the system of record, the solution must include a strategy for protecting the data in the event of downtime. This strategy may be similar to the persistent store capability discussed for in-memory data grids, or it could involve the use of nonvolatile RAM, a new technology that will likely play an increasingly prominent role in the future.
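The durability requirement in the last point is worth making concrete. One common protection strategy (alongside persistent stores and NVRAM) is a write-ahead log that is replayed on restart. The sketch below is a toy illustration of that idea, not any product's recovery mechanism; a Python list stands in for the durable log file.

```python
# Sketch of why an IMDB that is the system of record needs its own
# durability story: a write-ahead log replayed on startup. Illustrative
# only; a list stands in for a durable log file.
import json

class InMemoryDB:
    def __init__(self, wal):
        self.wal = wal            # serialized log records (the "disk")
        self.tables = {}
        for record in wal:        # recovery: replay the log on startup
            key, value = json.loads(record)
            self.tables[key] = value
    def put(self, key, value):
        self.wal.append(json.dumps([key, value]))  # durable before visible
        self.tables[key] = value
    def get(self, key):
        return self.tables.get(key)
```

Without the log (or an equivalent persistent store), a reboot would lose the only copy of the data, which is exactly the downtime risk the article flags.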

Today, a major bank with 135 million customers is using an in-memory database with a persistent store capability to develop a web-scale architecture that can handle up to 1.5PB of data, along with the required transaction volume. This solution serves as the system of record and does not sit atop an existing datastore.

In-memory computing platforms

Organizations developing a long-term strategy that involves accelerating existing applications and rolling out new ones may opt for an in-memory computing platform that combines the scalability of an IMDG with the full relational database capabilities of an IMDB. The in-memory computing platform, therefore, can be used to accelerate existing applications or be the basis for the creation of new or rearchitected applications that can take advantage of distributed computing and a persistent store.
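The dual role of a platform can be sketched as one component with two deployment modes: read/write-through cache in front of an existing database, or standalone system of record. The `MemoryPlatform` name and its constructor flag are invented for this illustration.

```python
# Sketch of the "platform" idea: one component usable either as a cache
# over an existing store or standalone. Names are illustrative.
class MemoryPlatform:
    def __init__(self, backing_store=None):
        self.backing = backing_store   # None => standalone system of record
        self.memory = {}
    def get(self, key):
        if key not in self.memory and self.backing is not None:
            self.memory[key] = self.backing.get(key)   # read-through
        return self.memory.get(key)
    def put(self, key, value):
        self.memory[key] = value
        if self.backing is not None:
            self.backing[key] = value                  # write-through
```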

In addition to deciding which technology best meets their needs, organizations should consider whether they require additional supporting in-memory technologies, such as:

  • A streaming analytics engine to manage all the complexity around dataflow and event processing.
  • A deep-learning-powered continuous-learning framework to serve as a building block for what Gartner refers to as in-process HTAP (hybrid transactional/analytical processing); that is, the ability to apply machine learning or deep learning analysis to operational data in real time.
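The HTAP idea in the second bullet (analyzing operational data in the transaction path rather than in a separate analytics system) can be sketched as scoring each event at write time. The threshold "model" below is a stand-in for a trained one, and `HTAPStore` is an invented name for illustration.

```python
# Sketch of in-process HTAP: each operational event is scored by a model
# as it is written, enabling a real-time reaction. Illustrative only.
class HTAPStore:
    def __init__(self, score_fn, alert_threshold):
        self.rows = []
        self.score = score_fn          # stand-in for an ML/DL model
        self.threshold = alert_threshold
        self.alerts = []
    def insert(self, event):
        risk = self.score(event)       # analytics runs in the write path
        self.rows.append({**event, "risk": risk})
        if risk >= self.threshold:
            self.alerts.append(event)  # real-time action on operational data
```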

In-memory computing technology is used by leading digital enterprises now and will become even more widely used in the future. The sooner you develop a solid understanding of the deployment strategies and capabilities of in-memory computing, the sooner you will be able to help your organization gain the competitive advantage it needs.

This article is published as part of the IDG Contributor Network.