Netflix is the big Kahuna of a Web media businesses, with 33 million subscribers in more than 40 countries. As Netflix's "watch now" streaming service has grown, the company has had to rethink its data and storage strategies to cope with ballooning workloads managed in the cloud. Today, the company is nearly complete in its migration from Oracle to the NoSQL database Cassandra, improving availability and essentially eliminating downtime incurred by database schema changes.
Netflix launched its streaming service in 2007, using the Oracle database as the back end. "We had a single data center, which meant we had a single point of failure," explains Adrian Cockcroft, cloud architect at Netflix. "We were approaching limits on trafﬁc and capacity. Now that people can watch Netflix streaming programming from their phones, from Wii devices, Roku boxes, and many others, the demand for availability increases all the time. We have more customers every quarter, more customers are using streaming, and they're using streaming at a greater rate."
[ Also on InfoWorld: Why Netflix is embracing Python over Java | Which freaking database should I use? | Download InfoWorld's Big Data Analytics Deep Dive for a comprehensive, practical overview of this booming field. ]
Data has grown as fast as the customer base, Cockcroft says: The number of API requests in January 2011 was 37 times higher than requests in January 2010. The company knew that outages or poor-quality streaming could drive away customers. "We knew we had to get out of the data center, so we could keep running and keep growing," Cockcroft says.
In 2010, Netflix began moving its data to Amazon Web Services. The next step was to replace its Oracle database with Apache Cassandra, an open source NoSQL database known for its scalability and enterprise-grade reliability. "For us, the problem with a central SQL database was that everything was in one place ii which is only convenient until it fails," Cockcroft explains. "And because these databases are expensive, you tend to put everything in there. Then everything fails at once."
Another problem was that schema changes required system downtime. "Every two weeks, we'd have at least 10 minutes of downtime to put in the new schema," he explains. "The limitations of a SQL database impacted our availability and scalability."