Facebook shouldn't be afraid to rewrite its code, and neither should you

When your application's load outstrips its ability to scale, sometimes the best solution is to start over from scratch

In most fields, there's a special kind of shame associated with having to start a project over from scratch. As an architect, for example, the last thing you want to hear is that one of your buildings will be torn down and rebuilt from the ground up because it can no longer support the weight of its tenants.

According to computer scientist and entrepreneur Michael Stonebraker, however, that's more or less the situation confronting Facebook right now. Only in Facebook's case, the "building" is a Web application, and the problem isn't concrete or steel girders; it's MySQL.

In 2008, Facebook famously disclosed that it had deployed a whopping 1,800 production MySQL servers, and the social networking giant's growth has only accelerated since then. As of now, Stonebraker says, Facebook has split its MySQL data store into some 4,000 shards, with 9,000 caching servers running 24/7 just to keep up with the load.

Facebook's struggles with MySQL are far from secret. In fact, the company maintains a MySQL at Facebook profile page with updates on its continuous quest to keep the open source database running efficiently at such a massive scale.

But to hear Stonebraker tell it, that quixotic journey should have ended long ago. He describes being saddled with Facebook's complex MySQL installation as "a fate worse than death." The only way out of this purgatory, he says, is for Facebook to "bite the bullet and rewrite everything." In other words: Tear this building down.

Naturally, Stonebraker's comments have ruffled a lot of feathers in the Facebook camp. But for the sake of argument, let's assume he's right. Let's assume Facebook really is nearing the limits of what MySQL can possibly do, and that the most effective solution at this point would be a total rewrite.

So what's the big deal?

Try and try again
Facebook would hardly be the first high-traffic website to attempt a major technology upgrade late into its life. In fact, a two-stage rollout has become something of a tradition among Web startups. The list of sites that have undergone a major technical revamp after launch reads like a veritable who's who of the Web's biggest names.

Remember when Twitter sounded like a silly idea? Its founders must have been skeptical at first, too, which may have been why they chose to build the site using the Ruby on Rails framework. Rails is known for its fast development times; according to O'Reilly Media's Tim O'Reilly, "Powerful Web applications that formerly might have taken weeks or months to develop can be produced in a matter of days."

As Twitter's user base grew, however, it must soon have become evident that a few days' coding wasn't going to cut it. In 2009, Twitter engineers announced that the company had begun migrating key systems from Ruby to Scala, a language that runs on the Java virtual machine (JVM), as a way around bottlenecks in the Ruby runtime environment. Today, Twitter still uses a mix of Ruby and Scala, but the effort to migrate performance-sensitive systems to the JVM continues (search being the most recent candidate).

Even before Ruby on Rails, developers were building sites using other Web frameworks designed for rapid application development. Remember ColdFusion? Now an Adobe product, the venerable platform doesn't get much attention from developers these days, but in 2003 it allowed a small group of colleagues to develop a social networking competitor to Friendster in just 10 days. The name of their site: MySpace.

MySpace's user base exploded, and in 2005 the social network and its parent company were acquired by Rupert Murdoch's News Corp. for $580 million. That same year, with its back-end servers buckling under the weight of its newfound popularity, the company began transitioning its systems from ColdFusion to .Net, with help from New Atlanta Communications' BlueDragon migration tool.

Start small but stay agile
Imagine how much time, money, and effort could have been saved had Facebook hitched its fortunes to an enterprise-class database instead of MySQL, or if Twitter or MySpace had built their services using Java or .Net to begin with, rather than bumbling around with Ruby on Rails or ColdFusion. But of course, that's all hindsight. The truth is, there are plenty of good reasons to launch a site using the tools you have available at the moment, even if it means you'll have to rewrite most of your code later.

For starters, it's easy to criticize a popular site, but for every Web application that succeeds, countless more fail. It simply doesn't make sense to invest big dollars in the most robust, scalable tools possible when your idea has yet to be proven in the marketplace.

Second, in the early phases of a Web project, developer efficiency is often even more important than the efficiency of your infrastructure. The longer it takes to bring a site to market, the more opportunity competitors have to outflank you. When your budget is modest, it makes sense to choose tools that allow the smallest staff possible to get the most done in the least amount of time, which is exactly what tools such as Rails and ColdFusion offer.

Third, technology itself evolves. Who's to say your platform of choice won't outgrow today's performance issues, allowing your current design to scale as your site grows?

Fourth, no matter how meticulously you plan, not every contingency can be foreseen. Reengineering your code base gives you the opportunity to correct past mistakes, such as problems with your security model or your database schema. It's possible you might end up doing extensive rewrites even if you don't switch platforms.

Finally, a website simply is not like a building. Investing in Web infrastructure is not the same as investing in steel and concrete. Building Web applications is a business that's intrinsically more agile and flexible than building real-world objects, which is a big part of what makes it such an exciting business to be in. So why not act like it?

Beware axe-grinders
As for Michael Stonebraker, he has an axe to grind. As the co-founder and CTO of VoltDB, Stonebraker would like nothing better than to see Facebook rewrite its code to free itself of its dependence on MySQL. That would only lend weight to his arguments that "old SQL" products, such as MySQL, should be "sent to the home for retired software" and that new startups should choose products like VoltDB to avoid Facebook's "fate worse than death."

Personally, I take Stonebraker's arguments with a hefty grain of salt, even if Facebook does end up rewriting substantial portions of its software. Given how Twitter and MySpace both weathered their own growing pains and how successful all three sites have been (despite MySpace's recent reversal of fortune), most startups can only dream of failing so spectacularly.

This article, "Facebook shouldn't be afraid to rewrite its code, and neither should you," originally appeared at InfoWorld.com. Read more of Neil McAllister's Fatal Exception blog at InfoWorld.com.

Copyright © 2011 IDG Communications, Inc.
