Open source lives! The R project is the real deal

The community around the R language is the real deal -- not just another feel-good open source contrivance to enable big companies to collaborate on code

Once open source became a big deal it was just a matter of time until it became big business. When we crossed that line, it was foreordained that defending the ideals that make open source great would be a persistent challenge.

It's therefore not surprising to see so many of today's prominent open source projects seemingly bought and sold by corporations.

Oh, sure, they're shrouded by the comforting guise of foundations -- but look at who controls those foundations. It's hard to remember that open source only really thrives when developers reign.

That's what makes the R Project so refreshing. R, the statistical programming language and environment that boasts millions of community members, has the most community-centered board I've ever seen. While many foundations are simply corporate vanity projects loaded with the highest bidders, R's board is filled with data scientists affiliated with universities, not corporations.

In this way, R offers the industry an example of how to do open source right. Importantly, R's resistance to corporate influence won't be weakened by this week's announcement of the new R Consortium.

A very different sort of foundation

Look at the board of directors for Cloud Foundry, as just one emblematic example of today's open source foundations. The foundation trumpets the corporations that make up its membership, and its chair comes from EMC, the company that started Cloud Foundry. Other board members hail from IBM, SAP, VMware (affiliated with EMC), HP, etc.

Suits, all.

And while this sounds critical, it's actually par for the course for open source these days. OpenStack? Filled top to bottom with board members who contribute cash to secure a place. OpenDaylight? Ditto. You get the idea.

While foundations have become mechanisms for corporations to collaborate on code, they've also become marketing vehicles to pretend a grassroots developer community actually exists.

But R is different. Despite today's announcement of significant corporate interest in R via the newly formed R Consortium, R remains a project of the data geeks for the data geeks. The R Consortium may make some corporate donors feel like they have a seat at the R table, but the real seats are filled by those who code.

Yes, the R Foundation has John Chambers on its board, but this John Chambers never spent a day at the helm of Cisco: He's a professor at Stanford.

Oh, and there's a professor at Oxford, a scientist with the International Agency for Research on Cancer, a professor with the Department of Statistical & Actuarial Sciences at the University of Western Ontario, and a senior financial engineer with Ketchum Trading.

Heck, the board isn't even based in Silicon Valley, but instead is seated in Austria.

Yes, I stepped in it

Given R's non-corporate nature, I shouldn't have been surprised by the community's response to my recent suggestion that Microsoft owned the R code and should consider contributing it to a foundation.

To paraphrase the response: "There already is a foundation -- and the foundation, not some corporation, owns the code!!"

I'll admit that I was taken aback. After all, my primary contention was that re-implementing R to get around its underlying GPL license would sacrifice R's great community. I hadn't bothered to take the time to dig into the provenance of the R code, as it wasn't material to the bulk of my article. Why wasn't that community grateful for the compliment, and indifferent to my eensie weensie faux pas?

Because the essence of R is important to its community, and that essence can't be purchased by any corporation.

The R I believe in isn't short of cash, mister

Not that R is devoid of corporate interest. Far from it.

In announcing the R Consortium to work in concert with the R Foundation, Linux Foundation Executive Director Jim Zemlin notes, "Millions of data scientists and academic researchers use R language every day and want to collaborate with their peers to share visualization and analysis techniques."

Many of those millions work for companies that fine-tune billions upon billions in assets according to work they do in R, from optimizing market trades to tailoring product pricing.

Then there are the Tibcos and other vendors that build R into their products. Microsoft, as intimated above, actually bought Revolution Analytics, the company formed to help advance R, to gain access to deep R development expertise, and has been embedding R throughout its product lines.

Nor is monetizing R out of bounds.

For example, Revolution Analytics Chief Community Officer David Smith told me that "You can extend R with packages, which can be under any license, per the R Foundation." So while the community guards the code under the GPL, companies can extend it with proprietary code provided they follow the foundation's rules.

That's fair, and a good balance between the overarching community goals and narrower corporate interests.

Back to open source's future

This is what makes R different -- and what makes it so successful. It's helped by corporate cash, but not controlled by it. The R Foundation has a host of benefactors and donors, but its list looks very different from most.

It also has members who, true to the Foundation, are voted in based on merit, not cash: "New ordinary members are selected based on their non-monetary contributions (code, effort ... ) to the R project. The initial set of ordinary members at establishment of the organization consisted of the members of the 'R Development Core Team.'"

We need more open source projects like this. That is, projects that manage to become big and influential without becoming consumed by corporate cash.

Sometimes you can have both, of course. Take Linux. There is a Linux Foundation, but it doesn't run the Linux project. Linus Torvalds does, and he really doesn't care very much what Intel wants in the project. He cares about great code. So Linux is able to collect large quantities of corporate cash without being driven by it.

But R and Linux are exceptions, not the rule. The new rule is money talks, but it can't ultimately overcome the lesson we learn from R: Community rules, and real open source communities are fueled by code, not cash.

Copyright © 2015 IDG Communications, Inc.