While NoSQL may be getting all the buzz, in many cases an old fashioned relational database, such as MySQL, may work just as well if not better. That was the message from a number of MySQL users who presented their stories at Oracle's first MySQL Connect conference, held Saturday and Sunday in San Francisco.
On Sunday, engineers and executives from Twitter, PayPal and Verizon discussed their use of MySQL or MySQL Cluster. In each case, MySQL was being used for large high volume, distributed workloads that have been increasingly thought of as the province of NoSQL data stores such as MongoDB and Cassandra.
[ Discover what's new in business applications with InfoWorld's Technology: Applications newsletter. | Keep up with the latest approaches to managing information overload and compliance in InfoWorld's Enterprise Data Explosion Digital Spotlight. ]
"A lot of people think they have a big data problem, and a lot of times they don't," said Daniel Austin, who is the chief architect for PayPal. "They have an urge to find a big data solution to a problem, because it looks good."
Austin admits that the typical relational database system (RDMS) architecture has not scaled very well to meet the influx of new data in many organizations, but he cast doubt on the idea that NoSQL data stores provide the answer. "You don't have to give up your relational model to have big data," Austin said.
Paypal itself deals with large amounts of data for its global payment transaction system, which must be fast and accurate. The system must be able to manage 100TB of fixed storage, and once data is written to the system, it must be able to be read from anywhere else in the world in less than a single second. This can be a challenge given that the fastest data can travel between the two most distant places on earth is about 67 milliseconds, thanks to the hard limit of the speed of light. "So that puts a lower bound of how fast things can go," Austin said. The company uses Amazon Web Services, spread across six different locations. All the live data is handled in-memory, rather than being immediately written to disk.
PayPal uses MySQL Cluster for a number of reasons, most notably that true High Availability (HA), which ensures that all the data entered into the system is captured immediately. Another advantage MySQL Cluster offered was scalability.
"We had to think about how to build the architecture a little bit," Austin said. The approach they used, called architectural tiling, was designed to "build a system that scales to an arbitrary number of users. And we did that with SQL," Austin said. " We feel confident we can scale beyond 100 million users no problem."
Another big user of MySQL is Twitter. Currently, Twitter has over 140 million active users, who issue about 400 million Twitter messages every day, all of which have to be stored, indexed and annotated with metadata. The company uses a modified version of MySQL 5.5 to handle this load. The company has six full time database administrators, to maintain "a few thousand database servers," said Jeremy Cole, the chief database architect for Twitter. The company also employs one full-time MySQL developer.
Cole spoke to why Twitter uses MySQL even when NoSQL databases would seemingly be better suited for such a heavy workload. "It comes down to a few key points," Cole said.
One is basic familiarity. "We have very extensive at-scale knowledge. We know how MySQL works internally. We know how to upgrade it, we know how to fix bugs and push out new releases," Cole said.