Those “evil” cloud companies aren’t sucking the life out of open source—they’re the major contributors

By some accounts, the open source world is about to end as evil cloud empires suck the marrow from fragile open source communities, giving little back. This narrative has taken hold, leading some doomsday prophets to preach the end of open source sustainability as we know it.

The data, however, suggests something very, very different.

According to two independent analyses of GitHub data, as well as CNCF data, the biggest contributors to open source projects are—you guessed it!—the public cloud companies. Indeed, precisely because they’re in the business of operationalizing software, not selling it, these companies are perhaps best positioned to fuel, not destroy, open source for many years to come.

Open-sourcing the forest, not just some trees

For those paying attention, it has been clear for some time that Microsoft and Google, in particular, have been the biggest, most public contributors to open source projects. For dominant platform companies intent on reaching developers, open source is a requirement, not a nice-to-have. Microsoft initially made waves by opening up to running and/or supporting all sorts of open source projects on Azure, while Google went a step further and open-sourced incredibly powerful code like Kubernetes and TensorFlow.

Even Amazon Web Services, the hegemonic cloud leader accused of skimping on open source contributions, can no longer sit on the sidelines of open source communities. While AWS has always been more active in open source than supposed, it dramatically upped its open source game in 2018.

All of which is captured in Adobe developer Fil Maj’s analysis of more than 6.2 million GitHub profiles and their contribution histories. Caveat: This is an inexact science, and of course leaves out significant code repositories (such as Apache projects). Even so, there’s plenty of signal in this analysis of user-company affiliations (self-reported on their profiles in the company field), and that signal says “the cloud rules open source.” The table shows his data.

IBM’s ranking is helped considerably by its acquisition of Red Hat. While the deal has yet to close, the table portrays the combined entity. Split the two and Google jumps into the second spot, Red Hat falls to a strong No. 3, and IBM takes a distant fourth. (Keep in mind, however, that companies like IBM may be more active with Apache projects, which aren’t represented in Maj’s tally.)

Felipe Hoffa takes a different approach to the GitHub data set, and here Microsoft’s and Google’s lead becomes even more apparent: In 2018, each had about 1,000 GitHub participants and contributed to about 1,000 repos. Red Hat comes in third at about 500 repos contributed to and 600 GitHub participants, with Amazon, IBM, Pivotal, and Intel following, all clustered around 400 of each. Microsoft, Google, Red Hat, Pivotal, and IBM were nearly as active in 2017 as they were in 2018, but Amazon about tripled its GitHub participants and more than doubled its contributed-to repos from 2017 to 2018.

Again, the data isn’t perfect, but it is still hard to avoid the conclusion that the biggest, most-active contributors to open source today are cloud companies. More broadly, using Maj’s data set, it’s interesting (but not surprising) that seven of the Top 10 largest open source contributors aren’t in the business of selling software: They sell services.

Why the cloud companies can afford to be so generous

Catch that? I’ll repeat it: The biggest contributors to open source software are not software companies per se. They’re cloud companies or companies otherwise not in the business of peddling software. Why does this matter? Because the companies that have struggled most to engage freely in open source communities have been those whose business models require them to lock down code. For companies whose business is hardware, cloud services, or something other than software, active contribution to open source can create more complements to core business value.

The big cloud companies increasingly see this, but one other takeaway from the Maj and Hoffa analyses is the dearth of nontechnology enterprises on the list. If “software is eating the world” and “developers are the new kingmakers,” as Silicon Valley pundits love to say, enterprises from industries as varied as financial services and retail should be active contributors to open source.

The problem, as HSBC chief architect David Knott told Mitch Wagner, is that “we haven't figured out yet … what we might expose ourselves to if we're making contributions. From an engineering perspective, we think it's the right thing to do and the responsible thing to do. But we need to understand it from a legal perspective.” Put another way, mainstream enterprises are a decade behind their more tech-heavy counterparts, which have been actively participating in open source for nearly two decades. These other enterprises will learn how and why to contribute over time, but they’re farther back on the learning curve than the tech companies.

All of which may mean we should spend a lot less time wringing our hands, blaming the cloud companies for putting open source sustainability at risk, and instead acknowledge the need to train a new generation of contributors. This new breed won’t have the constraints of a software license business model to overcome. Instead, they just need welcoming communities to train them.

