IT confronts the datacenter power crisis

As energy costs escalate, conserving resources tops the list of challenges for today's IT managers

When David Young told his colocation provider late last year that his online applications startup, Joyent, planned to add 10 servers to its 150-system datacenter, he received a rude awakening. The local power utility in Southern California wouldn’t be able to provide the additional electricity needed. Joyent’s upgrade would have to wait.

“We had to find creative ways to get through this period,” says Young, whose urgent need for more computing bandwidth forced him to contract with a second colocation provider.

Tales such as Young’s have become increasingly common during the past few years. The cost and availability of electricity are emerging as key concerns for IT managers when building datacenters, in many cases trumping such traditional considerations as seismic stability, purchase price, and quality of life for employees.

Google, for example, has watched its energy consumption almost double during the past three generations of upgrades to its sprawling computing infrastructure. It recently unveiled a major new datacenter site in a remote part of Oregon, where power costs are a fraction of those at Google’s home base in Silicon Valley. But cheap power may not be enough. Last year, Google engineer Luiz André Barroso predicted that energy costs would dwarf equipment costs — “possibly by a large margin” — if power-hungry datacenters didn’t mend their ways. Barroso went on to warn that datacenters’ growing appetite for power “could have serious consequences for the overall affordability of computing, not to mention the overall health of the planet.”

Keeping cool in a crisis

IDC analyst Michelle Bailey says U.S. companies spent approximately $5.8 billion powering servers in 2005 and another $3.5 billion or more keeping them cool. That compares with approximately $20.5 billion spent purchasing the equipment.

“It’s a big problem,” Bailey says of the skyrocketing energy bills. “Over the lifecycle of the system, actually powering and cooling the system starts to become almost equal to the price.”

Rather than any single, readily fixed cause, the current IT power crisis is the result of a combination of subtle trends. At its core is what Jerald Murphy, COO and director of research operations at Robert Frances Group, refers to as the “dark underbelly” of Moore’s Law: As processor performance has doubled every couple of years or so, so too has power consumption and its side effect, heat.

That wasn’t a problem decades ago, when the latest and greatest chip consumed 8 watts, instead of the 4 watts of its predecessor. But as power requirements slowly grew over time, things changed, until we reached a tipping point of sorts in the past two or three years. Today’s chips require anywhere from 90 to 110 watts, twice as much power as the chips of just a couple of years ago. They also run hotter, which drives up the cost of datacenter cooling. And if that weren’t enough, the growing use of blade servers, once viewed as a panacea for power and space limitations, is only making things worse.

“With blade servers and high-density servers, we’re packing more and more equipment into a smaller space, and that’s creating heat issues,” says John Welter, vice president of Valtus Imagery Services, a provider of computer-intensive graphical maps. In many cases, blades make it impossible to cool a fully populated room, meaning IT managers need a new datacenter even though the current one is only half full.

Compounding the power problem is the explosive growth of new IT services, as offices heed the call to use IT to automate sales, invoicing, and other business processes. That increases the number of servers in a typical datacenter. And, of course, no discussion of power would be complete without mentioning the price of oil, which has tripled since 2002.

“It’s kind of the perfect storm of IT power consumption,” Murphy says. “You’ve got more applications being consolidated into a smaller space, with chips that are hotter; and the servers using them are taking up less space, and there are more of them together.”

Mouths to feed

The first step in reducing power consumption costs is to take inventory of every piece of equipment on the datacenter floor, paying careful attention to both the amount of power each device consumes and the heat that it dissipates. This survey will allow IT managers to understand what percentage of their datacenter’s available power is being consumed by existing equipment and accurately predict how long it will take until demand outstrips capacity. The results will have a direct bearing on how to proceed. If a datacenter has 18 months before it maxes out, there’s plenty of time to devise a fix. A six-month window, on the other hand, will call for more drastic action.
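The survey described above boils down to a small calculation. The Python sketch below is illustrative only: the device names, wattages, capacity figure, and growth rate are all invented for the example, and it assumes demand grows by a fixed number of watts each month, which a real datacenter may not.

```python
import math
from dataclasses import dataclass


@dataclass
class Device:
    name: str      # hypothetical label for an inventory entry
    watts: float   # measured or nameplate power draw, in watts


def utilization(devices, capacity_watts):
    """Fraction of the available power already consumed by the inventory."""
    return sum(d.watts for d in devices) / capacity_watts


def months_until_full(devices, capacity_watts, growth_watts_per_month):
    """Whole months of headroom left, assuming linear growth in demand."""
    headroom = capacity_watts - sum(d.watts for d in devices)
    if headroom <= 0:
        return 0
    if growth_watts_per_month <= 0:
        return math.inf
    return math.floor(headroom / growth_watts_per_month)


# Made-up inventory: 150 servers at 400 W each, plus network gear.
inventory = [Device(f"server-{i}", 400.0) for i in range(150)]
inventory.append(Device("network", 3000.0))

capacity = 81_000.0   # assumed power available to the room, in watts
growth = 1_000.0      # assumed growth in demand, in watts per month

print(f"{utilization(inventory, capacity):.0%} of capacity in use")
print(f"{months_until_full(inventory, capacity, growth)} months of headroom")
```

With these made-up figures the room has 18 months of headroom. Triple the assumed growth rate and the same inventory leaves only six.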

One of the most obvious ways to reduce energy costs is to buy gear designed with power efficiency in mind. Pick a vendor — AMD, Dell, IBM, Intel, Hewlett-Packard, Sun Microsystems, or any other — and chances are it has a slew of new products that use fewer kilowatts to get the job done.

“All of these people have moved to address a major problem, which is you just can’t power these things,” says Miles Kelley, vice president of marketing at 365 Main, a datacenter host, speaking of the top-tier server vendors. “They’ve all shown up in our datacenter, so they must all be doing something right.”


Joyent was able to tame its energy mess by replacing its fleet of old Xeon servers with Sun systems. Approximately 25 of the new boxes, or about 20 percent of its machines, feature the power-efficient Sparc T1 processor. Because of its ability to handle 32 threads at a time, Sun executives say the T1 is the processing equivalent of a motor coach that can transport large numbers of passengers for less gas than dozens of smaller vehicles can.

Young says a third-party consultant who visited Joyent’s datacenter estimated the startup will save $1,200 per year for every T1-based server it uses, bringing the total saved from those machines to $30,000 a year. The finding, which was part of an audit commissioned by Pacific Gas & Electric (the utility that serves Joyent’s new datacenter), came as something of an epiphany for Young. “The fact that all servers weren’t created equal when it comes to power consumption came onto our radar,” he says. Sweetening the deal was a rebate of almost $25,000, or $989 per T1 box, that Joyent received from PG&E for the purchase.
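The arithmetic behind those figures is easy to check. This short Python sketch takes the article's "approximately 25" T1 servers as exactly 25, which is an assumption. The variable names are chosen for illustration.

```python
T1_SERVERS = 25              # "approximately 25" in the article; assumed exact here
SAVINGS_PER_SERVER = 1_200   # estimated dollars saved per server per year
REBATE_PER_SERVER = 989      # one-time PG&E rebate per T1 box, in dollars

annual_savings = T1_SERVERS * SAVINGS_PER_SERVER
total_rebate = T1_SERVERS * REBATE_PER_SERVER

print(f"Annual savings: ${annual_savings:,}")   # Annual savings: $30,000
print(f"Total rebate:   ${total_rebate:,}")     # Total rebate:   $24,725
```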

The Sparc T1 servers were a logical choice for highly threaded applications such as Joyent’s database, identity, and e-mail servers. But they didn’t stack up as well on single-threaded applications, such as those based on Ruby on Rails. For the remainder of its revamped fleet, Joyent relied on Sun boxes built with AMD Opteron CPUs. Young estimates that the Opteron-based machines, which round out the remaining 80 percent of his revamped datacenter, deliver approximately 35 percent more throughput than his older servers while consuming the same wattage.

Still other options abound. While AMD and Sun started preaching the virtues of power efficiency before the topic was in vogue, Intel, after enduring unfavorable power comparisons between its Xeon and AMD’s Opteron for years, has come roaring back with its Woodcrest design, which roughly doubles the performance of the previous top-of-the-line Xeon while drawing 35 percent less power.

Meanwhile, HP and IBM have attempted to address power and space problems through their blade designs, but as mentioned, heat dissipation can become an issue with high-density racks. One solution is to reduce the total number of servers in deployment; even when servers are running efficient chips, analysts say too often the machines are underutilized.

“People for a long time have adopted a one-application-per-server model,” IDC’s Bailey says. What results are datacenters with hundreds of machines, each of which runs at a small fraction of its capacity. The problem is that a server that’s only 10 percent utilized draws almost as many watts as one that’s running at 80 percent capacity.

Bailey and other IT advisers say companies can reap big savings by consolidating a handful of small jobs onto a single box through virtualization technologies. On the extreme end of this trend, IBM also advocates the use of mainframes running virtualization software, which it says allows many of its customers to replace dozens of juice-thirsty servers with a single machine (albeit an expensive one).
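To see why consolidation pays, consider a rough model in which a server draws a large fixed amount of power when idle and only a little more under load. The Python sketch below is hypothetical: the 200-watt idle and 250-watt peak figures are invented, and it assumes that workloads add up linearly and that virtualization carries no overhead of its own.

```python
def server_watts(utilization, idle_watts=200.0, peak_watts=250.0):
    """Linear power model: a big fixed idle draw plus a small load-dependent part.

    The default wattages are illustrative assumptions, not measurements.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_watts + (peak_watts - idle_watts) * utilization


# Eight boxes, each 10 percent utilized, versus one host carrying all the work.
before = 8 * server_watts(0.10)
after = 1 * server_watts(0.80)

print(f"Before consolidation: {before:.0f} W")
print(f"After consolidation:  {after:.0f} W")
print(f"Reduction:            {1 - after / before:.0%}")
```

Under these assumptions the eight lightly loaded machines draw 1,640 watts between them, while the single consolidated host draws 240 watts for the same work.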

Non-IT offenders

Replacing old machines with more energy-efficient gear and consolidating servers are great initial steps, but real power savings in the datacenter can’t be accomplished until managers tackle a snarl of inefficiencies that don’t fit as neatly into the traditional purview of the IT department.

That’s because for every watt that a server in the typical datacenter consumes, another 1 to 1.5 watts are burned up by nonserver gear, says Jonathan Koomey, staff scientist at Lawrence Berkeley National Laboratory. To solve the problem, IT managers must address long-standing shortcomings in air-conditioning systems, power equipment, and other gear that has not traditionally been a responsibility of their department.
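Koomey's ratio translates directly into a facility-wide estimate. The following Python sketch applies the 1-to-1.5-watt overhead range to a hypothetical 100-kilowatt server load. The load figure and the function name are assumptions made for the example.

```python
def facility_watts(server_watts, overhead_per_server_watt):
    """Total draw when each server watt brings extra nonserver watts with it."""
    return server_watts * (1.0 + overhead_per_server_watt)


server_load = 100_000.0  # hypothetical server load, in watts

low = facility_watts(server_load, 1.0)    # lower end of Koomey's range
high = facility_watts(server_load, 1.5)   # upper end of Koomey's range

print(f"Facility draw: {low / 1000:.0f} to {high / 1000:.0f} kW")
print(f"Share reaching servers: {server_load / high:.0%} to {server_load / low:.0%}")
```

Put another way, only 40 to 50 percent of the power entering such a facility reaches the servers themselves.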

“There’s really nobody on the IT side that’s seeing an electricity bill and saying, ‘Gee, I’m responsible for making that meter spin,’ ” says Richard Hodges, principal of GreenIT, a consultancy that advises clients on how to reduce IT power costs.

Bill Clifford, CEO of Aperture Technologies, a supplier of software that helps manage datacenters, agrees. His advice to IT managers: “Go find out who your facilities liaison is and become really good friends. A smart CIO today is going to want to have those types of people on their team to anticipate needs and not simply react to problems.”

The biggest problem with most cooling systems is that datacenters typically have way more than is needed, says Neil Rasmussen, CTO with American Power Conversion, which provides products and services for powering datacenters. He says many datacenters use gear that’s rated for three times as many servers as they’re actually serving. “A lot of people think, ‘I’ll invest in power and cooling for the future,’ ” he says. “In the meantime, you have this big 8,000-horsepower engine and it’s burning fuel.”

Overbuilding made more sense in past decades because installation of air-conditioning systems could require the removal of entire building walls. Now the gear is more modular, which allows designers to add units more gradually, as they are needed.

Another strategy to reduce cooling costs is to abandon the traditional row-oriented approach, in which cold air passes through ducts on a datacenter floor to cool the areas surrounding a bank of servers. Because the air coming out of the vent mixes with much warmer air in the room, this method requires cooling systems to be set to temperatures of 45 degrees Fahrenheit or lower, drawing a considerable amount of power.

Instead, Rasmussen encourages the adoption of rack-oriented cooling, which blows cold air directly into server racks, so there is less opportunity for it to mix with the ambient air. Under the rack-oriented approach, air is typically cooled to only 70 degrees Fahrenheit, significantly reducing the load on the cooling system. Rack-oriented cooling has the additional benefit of delivering air that has more moisture in it, in many cases eliminating the expense of powering and housing humidifiers.

Other cooling remedies include simple changes to a datacenter’s floor layout so that a server’s exhaust vents in one row aren’t blowing into the intake vents of the next row — which is a problem Rasmussen estimates plagues as many as 30 percent of datacenters. He also suggests companies located in regions where it gets cold use cooling systems that take advantage of those temperatures. So-called economizer settings can save a bundle by drawing on air from outside, eliminating the need to run compressors.

Thinking green

There are other ways to tame the power monster in the datacenter. One possibility that may not be as far off as many think is the use of DC (direct current) to power datacenters.

Even AC (alternating current)-powered datacenters can benefit from thinking smarter. One unnecessary draw on power comes from power supplies designed to handle much larger loads than the ones they are currently shouldering, says Chris Calwell, an analyst at Ecos Consulting. He says it’s not at all uncommon for a server that draws 150 watts to be fitted with a 300-watt power supply. Then, in the interest of redundancy, the IT manager will add a second 300-watt power supply and share the load. Companies could save plenty just by making power supplies more efficient, he says.
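Calwell's example can be put in numbers. The Python sketch below computes how heavily each power supply is loaded in the two configurations he describes, assuming the redundant pair splits the load evenly. The function name is invented for illustration. The sketch does not model efficiency itself, but power supplies generally convert power less efficiently when they run far below their rated load.

```python
def load_fraction(server_draw_watts, supply_rating_watts, supplies=1):
    """Fraction of its rating that each supply carries, with the load split evenly."""
    if supplies < 1:
        raise ValueError("at least one power supply is required")
    per_supply_watts = server_draw_watts / supplies
    return per_supply_watts / supply_rating_watts


single = load_fraction(150, 300)                  # one 300-watt supply
redundant = load_fraction(150, 300, supplies=2)   # two supplies sharing the load

print(f"Single supply:  {single:.0%} of rating")     # Single supply:  50% of rating
print(f"Redundant pair: {redundant:.0%} of rating")  # Redundant pair: 25% of rating
```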

Last, there’s the option of setting up new datacenters, or relocating large chunks of existing ones, in regions where power is cheap, a strategy 365 Main’s Kelley advocates to many of his customers.

“Some people come to the table saying, ‘My million-dollar power bill is too expensive,’ ” he says. “And we say, ‘Diversify to Phoenix because power is a third of the cost of power in New York.’ In that same breath we say, ‘Plus, you are now mitigating risk by diversifying the back end.’ ”

Whatever route customers take, the days when they can afford to ignore the problems of powering and cooling their datacenters are over. Not long ago, IT managers’ performance was measured by the availability and reliability of the machines they maintained. Their job was to ensure the IT requirements of their organization were met without interruption, and they were given broad leeway in how they made that happen. No more. At the rate things are moving, some datacenters will need more than 20 megawatts, enough to power a town of 25,000 homes.

“If there are any CIOs or senior VPs of datacenter operations that haven’t had the lightbulb go on around power issues, it can’t be far behind,” Aperture’s Clifford says. “The concern is that, left unregulated, the consumption of power by datacenters in America is going to be an issue that will impact the gross national product.”

Copyright © 2006 IDG Communications, Inc.
