AWS vs. open source: DocumentDB is the latest battlefront

In trying to prevent competition from AWS, vendors like MongoDB are undercutting open source. Ultimately, the battle could hurt both sides—and IT

You’ve probably missed it, but there’s a religious war being fought on Twitter. (No, really!) On one side is an array of data-infrastructure companies (MongoDB, Confluent, and Redis Labs) that claim that Amazon Web Services is strip-mining their open source code to sell cloud services like Amazon Aurora, RDS, and MSK (Managed Streaming for Kafka). On the other side is AWS, whose CEO Andy Jassy insists that its products aren’t intended as “a shot across the bow of anyone. If you look at what we are doing, it's very much informed by customers.”

Which leads us to today’s news that AWS is launching Amazon DocumentDB, a MongoDB-compatible database that was “designed from the ground up to give customers the performance, scalability, and availability they need when operating mission-critical MongoDB workloads at scale.” Shawn Bice, vice president of nonrelational databases at AWS, stressed to me that while “customers like MongoDB’s flexible data model and other attributes, they struggle to get the performance and availability from it that they require.”

For customers, this basically means they can get their MongoDB from AWS as a cloud service. But what it means for AWS, and for MongoDB, is much more complex. (I should point out that I was vice president of community at MongoDB some years ago.)

AWS: Have your MongoDB cake and eat it too

Although it has fallen a bit in popularity over the last few years relative to PostgreSQL, according to DB-Engines data, MongoDB remains the fifth-most popular database in the world. Given the ease with which developers can quickly become productive with MongoDB, this popularity isn’t surprising.

But one problem with that ease is that it’s also easy to go wrong with MongoDB. That’s a feature, not a bug, but it does mean that today’s quickly-hacked-together application can become tomorrow’s support and maintenance nightmare for a company’s operations team.

In identifying the reasons for its DocumentDB service, AWS noted, “Customers also find it challenging to build performant, highly available applications on MongoDB that can quickly scale to multiple terabytes and hundreds of thousands of reads and writes per second because of the complexity that comes with setting up and managing MongoDB clusters.”

This isn’t untrue, per se, but it’s also not the whole story.

After all, MongoDB has been working to make it easier to operate the database at scale with its Atlas service. Atlas has been a hit, growing 300 percent annually and now representing 22 percent of the company’s $260 million in annual revenue. This is great, but it’s also where the religious war gets started. Not content with competing on ease of use, MongoDB went one step further to try to monopolize making MongoDB operationally efficient by introducing its Server Side Public License in October 2018 to block anyone else (but primarily AWS) from selling a competing service.

Judging from today’s release of Amazon DocumentDB, it didn’t work.

Well, Amazon DocumentDB isn’t quite the same as MongoDB

As stated in the DocumentDB press release, “Amazon DocumentDB implements the Apache 2.0 open source MongoDB 3.6 API by emulating the responses that a MongoDB client expects from a MongoDB server, allowing customers to use their existing MongoDB drivers and tools with Amazon DocumentDB.” I asked AWS’s Bice to translate this into normal speak, to which he responded, “The vast majority of a customer’s MongoDB applications can be used with Amazon DocumentDB with little to no changes.”

When asked “But how will a customer know that their MongoDB application will work with Amazon DocumentDB?” Bice said it’s the top question on every enterprise’s mind. At launch, AWS will offer the most popular MongoDB services. Then, using its built-in API monitoring tool, it will track the most sought-after features to make the service better. Customers can read the documentation to see which services are offered; they can point their application at DocumentDB to see what will work from an API programmability standpoint; or they can use the AWS Database Migration Service to migrate an application to test whether it works. Again, Bice said, “If something isn’t supported by DocumentDB we pick that up in our telemetry and update the product accordingly.”

This means that, for most applications, most customers can get their MongoDB from AWS, without AWS running any MongoDB code on its servers, while pairing DocumentDB with all the other AWS services they use. As Sunjay Pandey, vice president at Capital One, noted, “Amazon DocumentDB integrates deeply with AWS services and provides us with a robust, highly scalable, and cost-effective database service that meets our operational requirements.”

This raises the ante for MongoDB’s Atlas. But what does it mean for the underlying open source project?

The larger tragedy of the open source wars with AWS

Think about it. As MongoDB said when announcing the SSPL, “We have invested approximately $300 million in R&D over the past decade to offer a modern, general-purpose, open source database for everyone.” The concern, said MongoDB CEO Dev Ittycheria, is that unless MongoDB can monetize the code exclusive of cloud competition, it won’t have enough money to sustain its development in the database.

This may seem persuasive, but it’s not necessarily true. For example, the open source project Magento changed how it worked with its developer community a few years back, and it now sees 60 percent of its code, including some of its hardest-to-develop features, come from third-party developers not employed by the company. Companies that figure out how to work with their communities can actually see innovation accelerate, not decline, regardless of the cash they’re able to charge for the product.

Even if you concede Ittycheria’s point, you then need to ask: What benefit does AWS get if it inadvertently kills the projects from which it’s building services?

I asked Bice what would happen if AWS succeeds wildly with millions of customers looking to it, rather than MongoDB (or Confluent or Redis Labs), to manage these open source projects as services. After all, for DocumentDB to provide a long-lasting, useful service, it needs MongoDB—not merely DocumentDB—to survive and thrive. It needs, in short, to contribute.

Bice gave a multipart response.

First, he said, Amazon’s “pace of contributions to open source continues to accelerate.” While AWS can certainly improve, the company has invested in a variety of projects for many years, including Xen (the basis for EC2), Linux, Kubernetes, the Robot Operating System, a variety of Apache projects (Lucene, Hadoop, Spark, etc.), and many more.

Nor have the company’s contributions been reserved for community-driven open source projects. In early 2018, for example, AWS open-sourced Encryption in Transit for Redis, a way to “secure real-time applications and encrypt all communications between clients and Redis servers.” This example is particularly interesting because, as AWS open source chief Adrian Cockcroft points out, Redis Labs “relicensed its code to make sure we wouldn’t contribute to it.” These companies say they want AWS to contribute, but when the cloud giant does, they block it.

But they shouldn’t. This Redis contribution calls out a key benefit of AWS: It operates at extreme availability and performance. The company arguably is in the best position to stress-test open source code. Not surprisingly, Bice said, many of its contributions relate to this area. Of course, it’s impossible to open-source all the elements that go into operational excellence, but this area is ripe for greater collaboration between AWS and the open source communities from which it derives benefits.

Which brings me to Bice’s third point: When AWS launches a service based on an open source project, “we are absolutely committing to that project for the long term.” This makes intuitive sense, given how customer-obsessed the company is. AWS isn’t looking to maintain a proprietary fork of Kubernetes or Kafka, for example, because “maintaining an internal, forked version of a project creates wasted effort. We know that’s not a good thing, as it slows innovation down.” As such, he said, “If there are not barriers put in front of us, we’ll be contributing feature enhancements, bug fixes, etc. to the open source projects we use because we want those projects to thrive.”

Not just limp along. To thrive.

The way forward: Greater mutual commitment to open source

In multiple, separate conversations with different people at AWS, I heard the same thing: “We don’t have a one-winner-takes-all view of the world.” The focus is on customers and meeting their needs.

Arguably, this is one of the biggest problems: AWS is simply better at operationalizing code than many of the commercial developers of open source projects. Those companies want an exclusive right to monetize open source (which, ironically, is not very open), but their customers just want to be able to run the open source code at scale to solve real business problems, spending less time futzing with patches, upgrades, etc.

According to Bice, “We invest in these open source communities. We think we can bring a lot of customers to any API that we’re supporting.” (And, yes, as MySQL database guru Mark Callaghan points out, AWS can and should do more.) This support isn’t simply engineering, but also marketing. Yes, this means that we may well see AWS trying to get more people interested in MongoDB and Amazon DocumentDB. The more AWS can grow the number of users around MongoDB, the more of them may end up wanting to use DocumentDB to run at scale.

“Open source collaboration across companies and academic institutions has produced some of the world’s most interesting breakthroughs,” Bice insisted. That’s something AWS wants to foster, not kill.

Much of the tension between AWS and the commercial providers of open source projects comes down to who should be able to serve customer needs. One area that should smooth over some of the disquiet is that both AWS and these providers need the open source projects to succeed.

Copyright © 2019 IDG Communications, Inc.