Will data gravity favor the cloud or the edge?

An industry standard confidential computing framework could unlock secure data processing at both the center and the edge

Applications are permeating the online economy. However, it’s not entirely clear whether their deployment is moving in a mainly centripetal or centrifugal direction—that is, whether they are gravitating to the cloud center or moving outward to the edge.

Applications—in the form of software, services, workloads, and business logic—tend to move in rough alignment with the data that they generate and consume. Depending on whom you ask, apps are either flocking en masse into the core of the cloud—attracted by the growing volume of data lakes—or scattering as minute microservices out to the edges, following the spread of mobile, embedded, and Internet-of-things devices.

Is data gravity real?

We live in an increasingly cloud-to-edge world, so the trend could easily go in either direction. Some observers cite the nouveau notion of “data gravity” to support whatever directional shift they’re seeing in the deployment of online applications.

In explaining the supposedly gravitational attraction of data, most observers seem to be assuming the following core principles:

  • Performance: As apps move closer to data, their latency drops and throughput increases when engaging in data-centric functions.
  • Control: As apps move closer to data, they can apply more comprehensive and fine-grained security and governance controls over the usage and processing of that data.
  • Cost: As apps move closer to data, they can benefit from greater efficiencies associated with hosting and managing the corresponding data-centric application workloads.
  • Scalability: As apps move closer to data, they can derive more data-centric value, owing to the cumulative impact of being able to easily access a higher volume and variety of data where that data is being persisted.
  • Functionality: As apps move closer to data, they can more thoroughly take advantage of the specialized data analysis, integration, manipulation, and other functionality provided by the underlying source or repository.
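The performance and cost principles above come down to simple arithmetic: whichever is smaller, the app or the data, is usually cheaper to move toward the other. What follows is a minimal back-of-the-envelope sketch, not a measurement of any real system. The names `Link`, `transfer_seconds`, and `cheaper_to_move`, and the sample sizes and link speed, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """A hypothetical network link between two sites (illustrative values only)."""
    bandwidth_gbps: float  # usable bandwidth, in gigabits per second
    rtt_ms: float          # round-trip latency, in milliseconds


def transfer_seconds(size_gb: float, link: Link) -> float:
    """Rough time to ship size_gb gigabytes: one round trip plus bulk transfer."""
    return link.rtt_ms / 1000.0 + (size_gb * 8.0) / link.bandwidth_gbps


def cheaper_to_move(data_gb: float, app_gb: float, link: Link) -> str:
    """Return "app" if relocating the app is faster than relocating the data."""
    data_time = transfer_seconds(data_gb, link)
    app_time = transfer_seconds(app_gb, link)
    return "app" if app_time < data_time else "data"


if __name__ == "__main__":
    wan = Link(bandwidth_gbps=1.0, rtt_ms=40.0)
    # A 500 GB data set versus a 0.2 GB container image (assumed sizes).
    print(cheaper_to_move(data_gb=500.0, app_gb=0.2, link=wan))
```

Under these assumed numbers, moving the app takes under two seconds while moving the data takes more than an hour, which is the intuition behind "data gravity." The same arithmetic flips when the data is a trickle of sensor readings and the app is large, which is why it supports edge deployment as readily as cloud deployment.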

Data gravity and hyperconverged infrastructure

If data gravity is real, we should expect to see its influence in the architecture of cloud-to-edge environments. However, it’s not at all clear that data gravity has had any net effect in that regard.

Some have pointed to hyperconverged infrastructure as a hardware enabler for data gravity at the center. According to this argument, data’s gravitational attraction has driven the tight coupling of data storage with application processing resources—compute, memory, networking, and virtualization—within a new generation of commodity hardware solutions in cloud data centers.

However, pointing to hyperconverged infrastructure as if it were an argument for cloud-centric data gravity ignores the fact that many such hardware boxes are deployed in edge environments, not just massively racked and stacked in cloud data centers. The gravitational pull from the edge comes more from the need for continuous optimized user experiences—mobile, interactive, real-time, streaming—than from any magically attractive mass in the data stored at those nodes.

Also, the “data gravity” argument for hyperconverged infrastructure could easily be flipped around. Compute power, which is the runtime foundation of all apps, tends to pull other resources—including data—in its direction. As those resources adhere in hyperconverged chassis alongside CPUs and other processors, concomitant improvements in application performance, control, cost, scale, and functionality could just as easily pull data to the edge as bind it to the cloud.

Data gravity and confidential computing

Lacking the ability to isolate and protect sensitive data in use, many organizations simply choose not to move this data outside their networks. Data gravity might shift more readily to the edges if the data can be protected in use through an approach that’s standardized across all platforms, applications, and tools.

To realize this vision, one key enabler would be post-perimeter security, under which authentication, permission, confidentiality, and other controls persistently follow the data to wherever it happens to live. Relevant processing nodes would always have access to the relevant security assets needed to unlock access to managed data resources in use, at rest, or in motion.

Another essential element would be confidential computing hardware that implements post-perimeter data security through trusted execution environments embedded at each node from cloud to edge. The privacy benefits of standardized confidential computing hardware accelerators are obvious. They are well-suited for embedding device-level password managers and key managers, blockchain and e-banking wallets, AI and machine learning applications, messaging apps, and any other programs that handle sensitive data. But they would necessitate significant changes in how development tools construct applications in order to take advantage of such protections.

This new paradigm represents a fundamental shift in how computation is done at the hardware level. It allows encrypted data to be processed in memory on a given node without exposing it to any unauthorized software programs or other local resources. As implemented in environments such as AMD’s Secure Encrypted Virtualization, Intel’s Software Guard Extensions, Red Hat’s Enarx, and Google’s Asylo Project, confidential computing technology would isolate sensitive data payloads while they’re in use in memory in applications.

If an industry-standard, hardware-enabled confidential computing framework gains traction in the marketplace, it could greatly reduce the gravitational pull of on-prem systems on sensitive enterprise data. The Linux Foundation’s recent launch of the Confidential Computing Consortium is a step in this direction. The group, whose core sponsors and participants include Alibaba, Arm, Baidu, Google, IBM, Intel, Microsoft, and others, is developing a common, cross-industry, open source framework for building persistent, in-memory, in-use security features into any application and ensuring that those features can be enforced stringently at any node.

Data at zero gravity

To fully realize the promise of confidential computing, an industry standard framework would need to integrate into a broader post-perimeter infrastructure. In the ideal environment, data security and governance controls would be enforced consistently anywhere the data resides, from the cloud core to the myriad edges. These controls would be executed efficiently and scalably under any scenario, including data in use, data in storage, and data in flight.

Would the ideal confidential computing infrastructure move data’s “gravity” from the core to the edge? Not necessarily. If it were a universal, standard, consistently high-performance infrastructure that supports all nodes in a distributed fabric, it should have no net effect on the distribution of data or the apps that feed on it.

Copyright © 2019 IDG Communications, Inc.