Answering the need for speed and scalability: the state of in-memory computing

Mature, cost-effective solutions are powering today’s digital transformation and omnichannel customer experience initiatives


In-memory computing has come a long way over the last few years. Today’s mature, cost-effective solutions deliver the massive application speed and scalability organizations need to power digital transformation and omnichannel customer experience initiatives.

These initiatives may take the form of web-scale applications, social media engagement, mobile applications, in-store systems, or IoT-driven applications. According to Gartner (“Predicts 2018: In-Memory Computing Technologies Remain Pervasive as Adoption Grows,” Biscotti, 12/22/17, ID# G00341703):

  • By 2019, 75 percent of cloud-native application development will use in-memory computing (IMC) or services using IMC to enable mainstream developers to implement high-performance, massively scalable applications.
  • By 2021, at least 25 percent of large and global organizations will adopt platforms combining multiple in-memory technologies to reduce their IMC infrastructure complexity.

However, in-memory computing technologies are continuing to evolve. IT decision makers therefore benefit from understanding the current maturity level of each technology, the use cases it supports, and what the future holds.

In-memory computing solutions include in-memory data grids, in-memory databases, event stream processing, and in-memory computing platforms that combine all three and more. According to the Gartner report “Hype Cycle for In-Memory Computing Technology, 2017” (Biscotti & Pezzini, 7/17/17, ID# G00314902), in-memory data grids are the most mature, having nearly reached the “plateau of productivity” and early mainstream adoption by July 2017. At that time, in-memory databases were in the “trough of disillusionment” and event stream processing was just passing the “peak of inflated expectations,” but both technologies are gaining in usage and users are increasingly incorporating them into their applications.

On the horizon is the adoption of memory-centric architectures to make in-memory computing more flexible and cost-effective for a wider range of organizations and industries. In-memory computing solutions are also beginning to incorporate machine learning capabilities, which will make it easier and less expensive for companies to take advantage of machine learning for applications such as fraud detection and website visitor recommendation engines.

Let’s take a quick look at all these technologies.

In-memory data grids for existing applications

An in-memory data grid, deployed on a cluster of on-premises or cloud servers, is inserted between the data and application layers of an existing application. The grid can leverage all of the cluster’s available memory and CPU power and can be scaled out simply by adding nodes to the cluster. It retains a copy of the disk-based data from RDBMS, NoSQL, or Hadoop databases in RAM, where processing takes place without the delays of disk reads and writes. Some in-memory data grids also support ANSI-99 SQL and ACID transactions, advanced security, and native integrations with Spark, Cassandra, and Hadoop.

An in-memory data grid is the simplest and most cost-effective way to speed up and scale out an existing architecture to support web-scale applications, IoT initiatives, or other data-intensive projects. In some use cases, an in-memory data grid also allows organizations to implement hybrid transactional/analytical processing (HTAP); that is, performing analytics on the live operational data set without impacting performance—an important requirement for the many digital transformation projects that require real-time data analysis to drive optimal user interactions.
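The caching pattern a data grid applies between the application and the database can be sketched in a few lines. This is a minimal, single-process illustration — the class and key names are invented for the example, and a plain dict stands in for both the distributed cache and the disk-based store:

```python
# Minimal sketch of the read-through/write-through pattern an in-memory
# data grid applies between an application and a disk-based database.
# A dict stands in for the distributed RAM tier; names are illustrative,
# not any vendor's API.

class ReadThroughGrid:
    def __init__(self, backing_store):
        self._ram = {}                # in-memory copy of hot data
        self._store = backing_store   # underlying RDBMS/NoSQL store

    def get(self, key):
        # Serve from RAM when possible; fall back to disk and cache.
        if key not in self._ram:
            self._ram[key] = self._store[key]
        return self._ram[key]

    def put(self, key, value):
        # Write-through: update RAM and the backing store together.
        self._ram[key] = value
        self._store[key] = value

db = {"user:1": {"name": "Ada"}}        # stand-in for a disk database
grid = ReadThroughGrid(db)
print(grid.get("user:1"))               # first read loads from "disk"
grid.put("user:2", {"name": "Grace"})   # write lands in both tiers
```

In a real deployment the `_ram` tier is partitioned across cluster nodes, which is what makes the pattern horizontally scalable.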

As described at the In-Memory Computing Summit North America 2017, Workday, a financials and HR SaaS solution provider serving Fortune 50 companies, has approximately 1,800 customers, with about 26 million workers under management. The company is currently using an in-memory data grid to process about 189 million transactions per day, peaking at around 289 million per day. For comparison, Twitter handles approximately 500 million tweets per day.

In-memory databases for new applications

Companies typically use an in-memory database when rearchitecting existing applications or building new ones. These solutions may rely on a memory-centric architecture designed for real-time speed and scalability, with the ability to trade off cost and performance. Today’s more advanced in-memory databases support data-processing APIs including ANSI-99 SQL, key-value, compute, and machine learning. All data resides in memory, and the absence of disk reads and writes can deliver performance up to 1,000 times faster than disk-based databases. Adopting an in-memory database for an existing application, however, means ripping out and replacing the existing database.
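The programming model is ordinary SQL — only the storage medium changes. As a toy illustration (a single-process stand-in, not a distributed product), Python’s built-in sqlite3 can run a database entirely in RAM:

```python
# Toy illustration of an in-memory database: Python's built-in sqlite3
# with the ":memory:" target keeps the table and all rows in RAM, and
# standard SQL runs against them. Production in-memory databases are
# distributed across a cluster, but the programming model is similar.

import sqlite3

con = sqlite3.connect(":memory:")      # database exists only in RAM
con.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, 19.99), (2, 5.00), (3, 12.50)])
total, = con.execute("SELECT SUM(amount) FROM orders").fetchone()
print(round(total, 2))                 # 37.49
```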

Event stream processing

Taking advantage of in-memory speeds, an event stream processing engine manages all the complexity around dataflow and event processing. It makes it easy for users to query active data without impacting performance. The streaming analytics engine enables companies to quickly get answers to questions, such as “What are the 10 most popular products over the last two hours?” or “What is the average product price in a certain category for the past day?” without an ETL process to move the data into an analytics database.
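The “top products over the last two hours” query above boils down to a sliding-window aggregation maintained in memory. Here is a minimal sketch of that mechanism — the class, event shape, and timestamps are invented for the example and stand in for what a streaming engine manages internally:

```python
# Sketch of the sliding-window aggregation behind a continuous query
# such as "the 10 most popular products over the last two hours".
# Events are (timestamp, product) tuples; all names are illustrative.

from collections import Counter, deque

class SlidingWindowTopN:
    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()          # (timestamp, product), oldest first
        self.counts = Counter()

    def add(self, ts, product):
        self.events.append((ts, product))
        self.counts[product] += 1
        # Evict events that have fallen out of the window.
        while self.events and self.events[0][0] <= ts - self.window:
            _, old_product = self.events.popleft()
            self.counts[old_product] -= 1
            if self.counts[old_product] == 0:
                del self.counts[old_product]

    def top(self, n):
        return self.counts.most_common(n)

w = SlidingWindowTopN(window_seconds=7200)    # two-hour window
for ts, p in [(0, "shoes"), (100, "hats"), (200, "shoes"), (8000, "hats")]:
    w.add(ts, p)
print(w.top(2))   # events at ts <= 800 were evicted by the ts=8000 arrival
```

Because the window state lives in memory and is updated incrementally per event, the answer is always current — no ETL step into an analytics database is needed.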

With the addition of continuous learning frameworks, in-memory computing platforms can apply machine learning models to make decisions based on the incoming data. Because the machine learning and deep learning libraries are part of the in-memory computing platform, they can support continuous learning, in which the models are updated as new data is added to the operational data set. In this way, the system adapts to changes in the incoming data in real time, continuously refining its decision making.
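Continuous learning, stripped to its essence, means updating the model one increment per incoming event rather than retraining in batch. A minimal sketch, using an invented synthetic stream and a plain-Python stochastic gradient step:

```python
# Sketch of continuous learning: a simple online linear model takes one
# stochastic-gradient step per incoming event, so it adapts as data
# arrives instead of waiting for a batch retrain. Data and learning
# rate are invented for the example.

def sgd_step(weights, x, y, lr=0.1):
    """One stochastic-gradient update for squared error on (x, y)."""
    pred = sum(w * xi for w, xi in zip(weights, x))
    err = pred - y
    return [w - lr * err * xi for w, xi in zip(weights, x)]

weights = [0.0, 0.0]
stream = [([1.0, 2.0], 5.0), ([2.0, 1.0], 4.0), ([1.0, 1.0], 3.0)] * 200
for x, y in stream:                  # each event updates the live model
    weights = sgd_step(weights, x, y)

# The model converges toward y = x1 + 2*x2 on this synthetic stream.
print([round(w, 2) for w in weights])
```

In a platform, the same per-event update runs against the operational data set at in-memory speeds, which is what keeps the model current with the stream.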

The rise of the in-memory computing platform

In-memory computing platforms combine in-memory data grids, in-memory databases, and stream processing, including continuous learning capabilities. These comprehensive platforms provide companies with the flexibility to speed up and scale out current applications and to build new applications using a complete product from a single vendor, who will support and continue to mature all the capabilities of the platform. A single platform significantly reduces development time, complexity, and cost, making it easier to take advantage of the power of in-memory computing.

With more than $1 trillion in client assets under management, Wellington Management has deployed its investment book of record (IBOR) on an in-memory computing platform. The Wellington IBOR serves as the single source of truth for investor positions, exposure, valuations, and performance. All real-time trading transactions and related account and back-office activity flow through the IBOR in real time. The IBOR also supports analytics for performance analysis, risk assessments, regulatory compliance, and more. Wellington’s IBOR, built on an in-memory computing platform, scales horizontally without practical limit, uses SQL, supports hybrid transactional/analytical processing (HTAP), and performs at least 10 times faster with in-memory computing than its underlying Oracle database.

If all of the above is what we’re currently seeing, what’s ahead?

Persistent store: a memory-centric architecture

One of the more interesting recent developments in in-memory computing is the introduction of memory-centric architectures based on a persistent store capability. A persistent store is a distributed, ACID- and ANSI-99 SQL-compliant disk store that can be deployed on spinning disks, solid-state drives (SSDs), 3D XPoint, and other storage-class memory technologies. It keeps the full data set on disk in fully operational form while holding only a user-defined, time-sensitive subset of the data in memory. This allows organizations to adjust how much data is kept in memory to strike an optimal trade-off between infrastructure cost and application performance. And because the data on disk is fully operational, there is no need to wait for all the data to be reloaded into RAM after a cluster restart. A persistent store also enables organizations to take advantage of HTAP without keeping 100 percent of the data in memory.
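The two-tier idea can be sketched compactly: the full data set lives on “disk,” a bounded hot subset is cached in RAM, and a restart loses only the RAM tier. This is a single-process illustration with invented names — a dict stands in for the SSD/3D XPoint tier:

```python
# Sketch of a memory-centric persistent store: the full data set lives
# on "disk" (a dict standing in for SSD/3D XPoint), while a bounded hot
# subset is cached in RAM with LRU eviction. After a "restart" the data
# is served from disk immediately, with no warm-up load into RAM.
# All names are illustrative.

from collections import OrderedDict

class TieredStore:
    def __init__(self, ram_capacity):
        self.disk = {}                    # full, always-operational copy
        self.ram = OrderedDict()          # bounded hot subset (LRU order)
        self.capacity = ram_capacity

    def put(self, key, value):
        self.disk[key] = value            # every write is persisted
        self._cache(key, value)

    def get(self, key):
        if key in self.ram:               # RAM hit: no disk access
            self.ram.move_to_end(key)
            return self.ram[key]
        value = self.disk[key]            # miss: disk is fully readable
        self._cache(key, value)
        return value

    def _cache(self, key, value):
        self.ram[key] = value
        self.ram.move_to_end(key)
        if len(self.ram) > self.capacity:
            self.ram.popitem(last=False)  # evict least-recently used

    def restart(self):
        self.ram.clear()                  # RAM is lost; disk survives

store = TieredStore(ram_capacity=2)
for k in ("a", "b", "c"):
    store.put(k, k.upper())
store.restart()
print(store.get("a"))                     # served from disk right away
```

Raising `ram_capacity` buys performance; lowering it saves memory cost — the trade-off the persistent store architecture exposes.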

Machine learning

Another important in-memory computing development is the incorporation of machine learning and deep learning capabilities into in-memory computing platforms. Machine learning and deep learning libraries can be optimized for massively parallel processing (MPP) against the data residing in the in-memory cluster. It then becomes possible to greatly accelerate large-scale machine learning and deep learning use cases by running the algorithms directly against petabyte-scale operational data sets in real time—without moving the data into a separate modeling database. This architecture also supports continuous learning, allowing companies to update machine learning models at in-memory speeds and with massive horizontal scalability to support fraud detection, ecommerce recommendation engines, and more.
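The MPP idea is to compute where the data lives: each partition produces a local gradient over its own shard, and only the small gradient vectors are aggregated. A minimal single-process sketch, with invented shards of a synthetic data set (a real platform would run each shard’s computation on the node that owns it):

```python
# Sketch of MPP-style training against partitioned in-memory data: each
# shard computes a local gradient over its own rows, and the coordinator
# sums the gradients -- no raw data is shipped to a separate modeling
# database. Shards and data are invented for the example.

def partial_gradient(weights, shard):
    """Squared-error gradient of a linear model over one data shard."""
    g = [0.0] * len(weights)
    for x, y in shard:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            g[i] += err * xi
    return g

def train(shards, weights, lr=0.01, epochs=200):
    for _ in range(epochs):
        # In a cluster, each shard's gradient would be computed in
        # parallel on its owning node; only these small vectors travel.
        grads = [partial_gradient(weights, s) for s in shards]
        total = [sum(g[i] for g in grads) for i in range(len(weights))]
        weights = [w - lr * t for w, t in zip(weights, total)]
    return weights

# Two shards of a synthetic data set satisfying y = 2*x.
shards = [[([1.0], 2.0), ([2.0], 4.0)], [([3.0], 6.0)]]
weights = train(shards, [0.0])
print([round(w, 3) for w in weights])   # converges to [2.0]
```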

With the mounting pressure on enterprises to launch digital transformation and omnichannel customer experience initiatives, the evolution of in-memory computing is one of the most significant trends in IT. Decision makers should track these developments closely and consider attending one of the growing number of conferences and meetups that will give them the opportunity to understand in more detail how their organizations can benefit.

This article is published as part of the IDG Contributor Network.