The role of the database in edge computing

Edge computing is about distributing data storage and processing. A next-gen, edge-ready database is key to keeping data consistent and in sync across the cloud, edge, and client tiers.


The concept of edge computing is simple. It’s about bringing compute and storage capabilities to the edge, to be in close proximity to devices, applications, and users that generate and consume the data. Mirroring the rapid growth of 5G infrastructure, the demand for edge computing will continue to accelerate in the present era of hyperconnectivity.

Everywhere you look, the demand for low-latency experiences continues to rise, propelled by technologies including IoT, AI/ML, and AR/VR/MR. While reducing latency and bandwidth costs and improving network resiliency are key drivers, another understated but equally important reason is adherence to data privacy and governance policies, which can prohibit the transfer of sensitive data to central cloud servers for processing.

Instead of relying on distant cloud data centers, edge computing architecture optimizes bandwidth usage and reduces round-trip latency costs by processing data at the edge, ensuring that end users have a positive experience with applications that are always fast and always available.

Forecasts predict that the global edge computing market will become an $18B space in just four years, expanding rapidly from what was a $4B market in 2020. Spurred by digital transformation initiatives and the proliferation of IoT devices (more than 15 billion will connect to enterprise infrastructure by 2029, according to Gartner), innovation at the edge will capture the imagination, and budgets, of enterprises.

Hence it is important for enterprises to understand the current state of edge computing, where it’s headed, and how to come up with an edge strategy that is future-proof.

Simplifying management of distributed architectures

Early edge computing deployments were custom hybrid clouds with applications and databases running on on-prem servers backed by a cloud back end. Typically, a rudimentary batch file transfer system was responsible for transferring data between the cloud and the on-prem servers.

In addition to the capital costs (CapEx), the operational costs (OpEx) of managing these distributed on-prem server installations at scale can be daunting. With a batch file transfer system, edge apps and services could end up running on stale data. And then there are cases where hosting a server rack on-prem is not practical (due to space, power, or cooling limitations on offshore oil rigs, at construction sites, or even in airplanes).

To alleviate the OpEx and CapEx concerns, the next generation of edge computing deployments should take advantage of the managed infrastructure-at-the-edge offerings from cloud providers. AWS Outposts, AWS Local Zones, Azure Private MEC, and Google Distributed Cloud, to name the leading examples, can significantly reduce the operational overhead of managing distributed servers. These cloud-edge locations can host storage and compute on behalf of multiple on-prem locations, reducing infrastructure costs while still providing low-latency access to data. In addition, edge computing deployments can harness the high bandwidth and ultra-low latency of 5G access networks through managed private 5G networks and offerings like AWS Wavelength.

Because edge computing is all about distributing data storage and processing, every edge strategy must consider the data platform. You will need to determine whether and how your database can fit the needs of your distributed architecture.

Future-proofing edge strategies with an edge-ready database

In a distributed architecture, data storage and processing can occur in multiple tiers: at the central cloud data centers, at cloud-edge locations, and at the client/device tier. In the last case, the device could be a mobile phone, a desktop system, or custom embedded hardware. From cloud to client, each tier provides higher guarantees of service availability and responsiveness than the previous tier. Co-locating the database with the application on the device guarantees the highest level of availability and responsiveness, with no reliance on network connectivity.
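To make the tiering concrete, here is a minimal sketch of a read path that tries the nearest tier first and falls back outward. The names `read_nearest`, `TierUnavailable`, and the tier labels are hypothetical, chosen only for illustration. They do not come from any particular database product.

```python
class TierUnavailable(Exception):
    """Raised by a tier that cannot be reached (for example, no network)."""


def read_nearest(key, tiers):
    """Try each tier from nearest to farthest and return (tier_name, value).

    `tiers` is an ordered list of (name, fetch) pairs. Each fetch(key)
    returns a value, returns None if the key is absent at that tier,
    or raises TierUnavailable if the tier cannot be reached.
    """
    for name, fetch in tiers:
        try:
            value = fetch(key)
        except TierUnavailable:
            continue  # skip unreachable tiers and keep moving outward
        if value is not None:
            return name, value
    raise KeyError(key)
```

A read that the on-device store can satisfy never touches the network, which is why the device tier offers the strongest availability guarantee in this model.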

A key aspect of distributed databases is the ability to keep the data consistent and in sync across these various tiers, subject to network availability. Data sync is not about bulk transfer or duplication of data across these distributed islands. It is the ability to transfer only the relevant subset of data at scale, in a manner that is resilient to network disruptions. For example, in retail, only store-specific data may need to be transferred downstream to store locations. Or, in healthcare, only aggregated (and anonymized) patient data may need to be sent upstream from hospital data centers.
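The retail case can be sketched as a filtered, resumable pull. This is an illustrative model only, not the sync protocol of any specific product. The `Change` record, the `store_id` routing key, the `checkpoint` field, and the `link_up` callback are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    seq: int        # position in the central change log (monotonically increasing)
    store_id: str   # hypothetical routing key: the store this document belongs to
    doc_id: str
    body: dict


@dataclass
class EdgeReplica:
    store_id: str
    checkpoint: int = 0                       # last sequence number applied locally
    docs: dict = field(default_factory=dict)  # local copy of this store's documents

    def pull(self, change_log, batch_size=2, link_up=lambda: True):
        """Apply pending changes for this store in small batches.

        Returns the number of changes applied. If the link drops between
        batches, the checkpoint stays at the last applied change, so the
        next pull resumes from there instead of starting over.
        """
        pending = [c for c in change_log
                   if c.seq > self.checkpoint and c.store_id == self.store_id]
        applied = 0
        for start in range(0, len(pending), batch_size):
            if not link_up():
                break
            for change in pending[start:start + batch_size]:
                self.docs[change.doc_id] = change.body
                self.checkpoint = change.seq
                applied += 1
        return applied
```

Only changes tagged with the replica's own `store_id` cross the network, and an interrupted pull costs at most one batch of repeated work.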

Challenges of data governance are exacerbated in a distributed environment and must be a key consideration in an edge strategy. For instance, the data platform should be able to facilitate implementation of data retention policies down to the device level.
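As a rough illustration of device-level retention, the sketch below purges locally stored documents once they exceed a maximum age for their type. The `RETENTION` table, the document types, and the `created_at` field are hypothetical. A real policy would come from your governance requirements and your data platform's own expiration features.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy: maximum age per document type on the device.
RETENTION = {
    "patient_vitals": timedelta(days=30),
    "audit_log": timedelta(days=365),
}


def purge_expired(docs, now, retention=RETENTION):
    """Return (kept, purged_ids) after applying the retention policy.

    `docs` maps doc_id to a dict with "type" and "created_at" keys.
    Documents whose type has no policy entry are kept.
    """
    kept, purged = {}, []
    for doc_id, doc in docs.items():
        max_age = retention.get(doc["type"])
        if max_age is not None and now - doc["created_at"] > max_age:
            purged.append(doc_id)
        else:
            kept[doc_id] = doc
    return kept, purged
```

Running a check like this on the device itself means the policy still applies while the device is offline.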

Edge computing at PepsiCo and BackpackEMR

For many enterprises, a distributed database and data sync solution is foundational to a successful edge computing solution.

Consider PepsiCo, a Fortune 50 conglomerate with employees all over the world, some of whom operate in environments where internet connectivity is not always available. Its sales reps needed an offline-ready solution to do their jobs properly and more efficiently. PepsiCo’s solution leveraged an offline-first database embedded within the apps that its sales reps use in the field, regardless of internet connectivity. Whenever an internet connection becomes available, all data is automatically synchronized across the organization’s edge infrastructure, ensuring data integrity so that applications meet stringent governance and security requirements.

Healthcare company BackpackEMR provides software solutions for mobile clinics in rural, underserved communities across the globe. These remote locations often have little or no internet access, which limits their ability to use traditional cloud-based services. BackpackEMR’s solution uses an embedded database within its patient-care apps, with peer-to-peer data sync capabilities that BackpackEMR teams use to share patient data across devices in real time, even with no internet connection.

IDC predicts that by 2023, 50% of new enterprise IT infrastructure deployed will be at the edge rather than in corporate data centers, and that by 2024, the number of apps at the edge will increase 800%. As enterprises rationalize their next-gen application workloads, it is imperative to consider edge computing to augment cloud computing strategies.

Priya Rajagopal is the director of product management at Couchbase, provider of a leading modern database for enterprise applications that 30% of the Fortune 100 depend on. With over 20 years of experience in building software solutions, Priya is a co-inventor on 22 technology patents.

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.

Copyright © 2023 IDG Communications, Inc.
