Engineering teams from Facebook, Google, LinkedIn, and Twitter have begun work on WebScaleSQL, a project designed to address the challenges of running MySQL at Web scale.
"Delivering a reliable, personalized experience for the 1.23 billion people who use Facebook requires an expansive and large-scale infrastructure," says Steaphan Greene, a Facebook software engineer. "As Facebook has grown, so has our MySQL deployment, and that deployment is now one of the largest in the world. Along the way, we've learned and benefited from code changes made by the MySQL community."
To pay that back and help other organizations run MySQL at scale, engineering teams from the four companies began working to share a common set of changes to the upstream MySQL branch with the intention of making their contributions available via open source. Greene says the collaboration expands on existing efforts of the MySQL community, and WebScaleSQL will continue to track the upstream branch (currently MySQL 5.6).
"Our goal in launching WebScaleSQL is to enable the scale-oriented members of the MySQL community to work more closely together in order to prioritize the aspects that are most important to us," Greene says. "We aim to create a more integrated system of knowledge-sharing to help companies leverage the great features already found in MySQL 5.6, while building and adding more features that are specific to deployments in large-scale environments."
Greene says that those contributing to WebScaleSQL have already produced some results, including the following:
- An automated framework that will run and publish the results of MySQL's built-in test system (mtr) for each proposed change.
- A full new suite of stress tests and a prototype automated performance testing system.
- Changes to existing tests and the structure of some existing code to avoid problems where otherwise safe code changes caused unnecessary conflicts or caused tests to fail.
- Performance improvements, including buffer pool flushing improvements, optimizations to certain types of queries, and support for the NUMA interleave policy.
In addition, Greene says the Facebook team is currently working on a number of other improvements, including an asynchronous MySQL client; moving Facebook's production-tested versions of table, user and compression statistics into WebScaleSQL; and adding the Logical Read-Ahead mechanism, which production tests have shown produces large and quantifiable speed improvements (up to 10x) for full table scans such as nightly logical backups.
Greene says he encourages others dealing with similar challenges to join in the WebScaleSQL efforts.
"We will keep all our WebScaleSQL work open to create a useful branch for others within the MySQL community who are focused on scale deployments," he says. "We'll continue to follow the most up-to-date upstream version of MySQL. As long as the MySQL community releases continue, we are committed to remaining a branch -- and not a fork -- of MySQL."
Thor Olavsrud covers IT Security, Big Data, Open Source, Microsoft Tools and Servers for CIO.com.