August 13, 2004

IT Myth 6: IT doesn't scale

Reality: Virtually any technology is scalable, provided you combine the right ingredients and implement them effectively.

At one time or another, nearly every kind of information technology has been judged and found wanting. The failures are often summed up in that most damning of epithets: “It doesn’t scale.” The reason, of course, is that at one time or another, for one reason or another, every kind of information technology has failed to scale.

Unfortunately for the victims tarred with that brush, scalability is a wildly imprecise term. Applications may be expected to scale up to massive server farms or scale down to handsets. And size is only one axis of scalability. Others include bandwidth, transactional intensity, service availability, transitivity of trust, query performance, and the human comprehensibility of source code or end-user information display.

There is no magic bullet that will slay all of these demons, but that doesn’t stop us from trying to find one. Case in point: the recent furor that erupted when Friendster, a social-networking service, switched from J2EE to PHP and improved its response time dramatically. Reacting to a long history of allegations that “scripting languages don’t scale,” advocates of PHP could now gleefully assert, “Java doesn’t scale.”

The debate generated a lot of heat but also shed some light on what PHP’s inventor, Rasmus Lerdorf, calls its “shared nothing” architecture. Because PHP is stateless, he explains, potential bottlenecks are pushed out of the Web tier and into the database tier. If you’re using Oracle, Lerdorf says, scalability is proportional to “how big a check you write to Oracle every year,” and if you’re using MySQL or PostgreSQL, “it comes down to whether you have configured replication correctly and have a nicely architected tree of database machines.”
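To make the "shared nothing" idea concrete, here is a minimal sketch in Python rather than PHP. It is an illustration under stated assumptions, not Friendster's or Lerdorf's code: `SessionStore`, `handle_request`, and `simulate` are hypothetical names, and an in-memory dictionary stands in for the database tier. The point is only that the handler keeps nothing between requests, so any worker in the Web tier can serve any request.

```python
from dataclasses import dataclass, field


@dataclass
class SessionStore:
    """Hypothetical stand-in for the database tier; the only place state lives."""
    rows: dict = field(default_factory=dict)

    def load(self, session_id):
        # Return a copy so callers cannot hold a live reference to stored state.
        return dict(self.rows.get(session_id, {"visits": 0}))

    def save(self, session_id, data):
        self.rows[session_id] = dict(data)


def handle_request(store, session_id):
    """Stateless handler: load state, update it, write it back, keep nothing."""
    session = store.load(session_id)
    session["visits"] += 1
    store.save(session_id, session)
    return f"visit {session['visits']} for {session_id}"


def simulate(requests, worker_count):
    """Spread requests round-robin over workers that hold no state of their own."""
    store = SessionStore()
    responses = []
    for i, session_id in enumerate(requests):
        worker = i % worker_count  # any worker can serve any request
        responses.append((worker, handle_request(store, session_id)))
    return responses


if __name__ == "__main__":
    for worker, body in simulate(["alice", "alice", "alice"], worker_count=2):
        print(f"worker {worker}: {body}")
```

Because every request round-trips to the store, adding Web-tier workers is easy, and the load lands on the store instead, which is the trade-off Lerdorf describes.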

Of course, Java can be used in a similar way. When eBay made its widely publicized switch to J2EE, the statelessness of the new architecture was cited as a critical success factor. “Part of the mandate of EJB is to be stateless,” says Sun Distinguished Engineer John Crupi, whose team helped redesign eBay. The revised architecture used stateless session beans, avoided clustering, and focused on a set of business objects backed by eBay’s highly customized database tier.

In the end, scalability isn’t an inherent property of programming languages, application servers, or even databases. It arises from the artful combination of ingredients into an effective solution. There’s no single recipe. No matter how mighty your database, for example, it can become a bottleneck when used inappropriately. Many dot-com-era Web publishers learned that lesson the hard way when their database-driven sites were crushed by the Slashdot horde.

The current blogging revolution represents, among other things, a better balance between two complementary methods: serving dynamic content from a database and serving cached, static content from a file system.
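One way to picture that balance is a publisher that renders a page from the database once, writes the result to disk, and serves the file from then on. The sketch below is a hypothetical illustration in Python, not the design of any particular blogging tool: `CachedPublisher`, its `render` callback, and the `slug` naming are all assumptions, and it presumes slugs are already safe to use as file names.

```python
import os
import tempfile


class CachedPublisher:
    """Hypothetical sketch: render from the database once, then serve a static file."""

    def __init__(self, cache_dir, render):
        self.cache_dir = cache_dir
        self.render = render      # assumed to be the expensive, database-backed step
        self.render_calls = 0     # counts how often the dynamic path actually ran

    def _path(self, slug):
        # Assumes slug is a trusted, file-system-safe identifier.
        return os.path.join(self.cache_dir, f"{slug}.html")

    def get(self, slug):
        path = self._path(slug)
        if os.path.exists(path):
            # Static path: no database work at all.
            with open(path, encoding="utf-8") as f:
                return f.read()
        # Dynamic path: render once and keep the result on disk.
        html = self.render(slug)
        self.render_calls += 1
        with open(path, "w", encoding="utf-8") as f:
            f.write(html)
        return html

    def invalidate(self, slug):
        """Drop the cached file, for example after the post is edited."""
        path = self._path(slug)
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache_dir:
        publisher = CachedPublisher(cache_dir, lambda slug: f"<h1>{slug}</h1>")
        publisher.get("it-myth-6")
        publisher.get("it-myth-6")
        print(publisher.render_calls)
```

A traffic spike on a single post then hits the file system rather than the database, and the dynamic path runs again only after `invalidate` is called.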

It’s tempting to conclude that the decentralized, loosely coupled Web architecture is intrinsically scalable.

Not so. We’ve simply learned — and are still learning — how to mix those ingredients properly. Formats and protocols that people can read and write enhance scalability along the human axis. Caching and load-balancing techniques help us with bandwidth and availability.

But some kinds of problems will always require a different mix of ingredients. Microsoft has consolidated its internal business applications, for example, onto a single instance of SAP. In this case, the successful architecture is centralized and tightly coupled.

For any technology, the statement “X doesn’t scale” is a myth. The reality is that there are ways X can be made to scale and ways to screw up trying. Understanding the possibilities and avoiding the pitfalls requires experience that doesn’t (yet) come in a box.


©1994-2009 Infoworld, Inc.