At one time or another, nearly every kind of information technology has been judged and found wanting. The failures are often summed up in that most damning of epithets: “It doesn’t scale.” The reason, of course, is that at one time or another, for one reason or another, every kind of information technology has failed to scale.
Unfortunately for the victims tarred with that brush, scalability is a wildly imprecise term. Applications may be expected to scale up to massive server farms or scale down to handsets. And size is only one axis of scalability. Others include bandwidth, transactional intensity, service availability, transitivity of trust, query performance, and the human comprehensibility of source code or end-user information display.
There is no magic bullet that will slay all of these demons, but that doesn’t stop us from trying to find one. Case in point: the recent furor that erupted when Friendster, a social-networking service, switched from J2EE to PHP and improved its response time dramatically. Reacting to a long history of allegations that “scripting languages don’t scale,” advocates of PHP could now gleefully assert, “Java doesn’t scale.”
The debate generated a lot of heat but also shed some light on what PHP’s inventor, Rasmus Lerdorf, calls its “shared nothing” architecture. Because PHP is stateless, he explains, potential bottlenecks are pushed out of the Web tier and into the database tier. If you’re using Oracle, Lerdorf says, scalability is proportional to “how big a check you write to Oracle every year,” and if you’re using MySQL or PostgreSQL, “it comes down to whether you have configured replication correctly and have a nicely architected tree of database machines.”
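To make the “shared nothing” idea concrete, here is a minimal sketch in Python rather than PHP. It assumes a hypothetical `SessionStore` class standing in for the database tier and a hypothetical `handle_request` function standing in for a page script; neither comes from Friendster or from Lerdorf. The point it illustrates is only that the handler keeps no state of its own, so any worker can serve any request.

```python
from dataclasses import dataclass, field


@dataclass
class SessionStore:
    """Hypothetical stand-in for the database tier: the only place state lives."""
    rows: dict = field(default_factory=dict)

    def load(self, session_id):
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self.rows.get(session_id, {}))

    def save(self, session_id, data):
        self.rows[session_id] = dict(data)


def handle_request(store, session_id, item):
    """Stateless handler: load state, update it, write it back.

    Nothing is kept in module globals or on a worker object, so the
    web tier holds no per-user state between requests.
    """
    session = store.load(session_id)
    cart = session.get("cart", [])
    cart.append(item)
    session["cart"] = cart
    store.save(session_id, session)
    return len(cart)


def make_worker(store):
    """A 'worker' is just the handler bound to the shared store."""
    return lambda session_id, item: handle_request(store, session_id, item)


store = SessionStore()
workers = [make_worker(store) for _ in range(3)]

# Route successive requests from one user to different workers.
counts = [workers[i % 3]("alice", f"item-{i}") for i in range(4)]
```

Because each request round-trips through the store, adding workers costs nothing in coordination. The price, as Lerdorf's remarks suggest, is that every bit of load lands on the store, which is where the real scaling work then has to happen.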
Of course, Java can be used in a similar way. When eBay made its widely publicized switch to J2EE, the statelessness of the new architecture was cited as a critical success factor. “Part of the mandate of EJB is to be stateless,” says Sun Distinguished Engineer John Crupi, whose team helped redesign eBay. The revised architecture used stateless session beans, avoided clustering, and focused on a set of business objects backed by eBay’s highly customized database tier.
In the end, scalability isn’t an inherent property of programming languages, application servers, or even databases. It arises from the artful combination of ingredients into an effective solution. There’s no single recipe. No matter how mighty your database, for example, it can become a bottleneck when used inappropriately. Many dot-com-era Web publishers learned that lesson the hard way when their database-driven sites were crushed by the Slashdot horde.
The current blogging revolution represents, among other things, a better balance between two complementary methods: serving dynamic content from a database and serving cached, static content from a file system.
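One way to picture that balance is the following sketch, which assumes a hypothetical in-memory `posts` dictionary in place of a real database and hypothetical `publish` and `serve` functions. It is not how any particular blogging tool works. The dynamic rendering step runs once, when a post is saved, and every later read comes straight from a file.

```python
import tempfile
from pathlib import Path

posts = {}          # hypothetical stand-in for the database
render_count = 0    # how many times the dynamic path has run


def render(slug):
    """Dynamic path: build HTML from the 'database' row."""
    global render_count
    render_count += 1
    post = posts[slug]
    return f"<h1>{post['title']}</h1>\n<p>{post['body']}</p>\n"


def publish(root, slug, title, body):
    """Write to the database, then bake a static file once."""
    posts[slug] = {"title": title, "body": body}
    path = Path(root) / f"{slug}.html"
    path.write_text(render(slug), encoding="utf-8")
    return path


def serve(root, slug):
    """Read path: served from the file system, with no database access."""
    return (Path(root) / f"{slug}.html").read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as root:
    publish(root, "hello", "Hello", "First post.")
    # A burst of readers never touches the database or the renderer.
    pages = [serve(root, "hello") for _ in range(1000)]
```

After the burst of a thousand reads, `render_count` is still 1. In a real deployment the static files would be handed to the Web server directly, so a traffic spike exercises the file system rather than the database.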
It’s tempting to conclude that the decentralized, loosely coupled Web architecture is intrinsically scalable. Not so. We’ve simply learned, and are still learning, how to mix those ingredients properly. Formats and protocols that people can read and write enhance scalability along the human axis. Caching and load-balancing techniques help us with bandwidth and availability.
But some kinds of problems will always require a different mix of ingredients. Microsoft has consolidated its internal business applications, for example, onto a single instance of SAP. In this case, the successful architecture is centralized and tightly coupled.
For any technology, the statement “X doesn’t scale” is a myth. The reality is that there are ways X can be made to scale and ways to screw up trying. Understanding the possibilities and avoiding the pitfalls requires experience that doesn’t (yet) come in a box.