Overprovisioning and overallocation often lead to overspending in the datacenter. There's certainly something to be said (such as, "I don't want to lose my job") for ensuring that your facility has sufficient power, computing hardware, and backup equipment to maintain precious uptime. However, the trade-off can be thousands -- if not millions -- of dollars wasted on excess gear that eats up valuable white space and costly watts of electricity.
Datacenter operators are tackling the problem in numerous ways, such as turning down or eliminating CRAC units, hunting down zombie servers, and employing virtualization to reduce machine count. Some are taking their efforts a step further, employing an emerging technology called power capping that boosts server density and saves on space and power.
As the name implies, power capping refers to the practice of limiting how much electricity a server can consume. Typically, the power allocated to a server is steady and fixed, based on a worst-case scenario: how much power the server needs when running at maximum utilization. In reality, most servers in the datacenter don't come close to reaching maximum utilization. That means that most datacenter operators are setting unnecessarily low limits on how many servers they can deploy.
Stuffing the power envelope
Let's say you have a max power envelope of 1MW. For the sake of argument, let's say 400,000 watts of that megawatt goes to power, cooling, storage, and networking equipment, which leaves 600,000 watts to allocate to your servers. You decide to stick to the power allocation printed on the nameplates of your machines, which is 400W. That means that your budget allows 1,500 1U servers in your datacenter.
But what if, in reality, your servers never need more than an average 300 watts of power to maintain their required performance level? If there was a way to ensure you didn't exceed your 1MW power limit, you could pack 2,000 1U servers into the same amount of space -- with little to no need to add power and cooling infrastructure.
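The arithmetic behind that example can be sketched in a few lines. The figures below are the ones from the scenario above; the function name is mine, not from any vendor tool.

```python
# Power-budget arithmetic for the 1MW example above.
TOTAL_BUDGET_W = 1_000_000   # 1MW facility envelope
OVERHEAD_W = 400_000         # power, cooling, storage, and networking gear
SERVER_BUDGET_W = TOTAL_BUDGET_W - OVERHEAD_W  # 600,000W left for servers

def max_servers(per_server_watts: int) -> int:
    """Number of 1U servers that fit in the remaining power budget."""
    return SERVER_BUDGET_W // per_server_watts

print(max_servers(400))  # nameplate allocation -> 1500 servers
print(max_servers(300))  # capped allocation    -> 2000 servers
```

Same racks, same envelope: capping the allocation at 300W instead of the 400W nameplate figure buys you 500 more servers.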
That's where power capping comes in. With power capping and complementary management software, you could ensure that no server draws more than 300 watts at once. Some companies, such as Intel, have developed power capping technology that can be applied at the rack level.
Intel's power capping magic is called Intel Dynamic Power Node Manager Technology. Designed for servers running Intel's Xeon 5500 chips, Node Manager is an out-of-band power management policy engine, embedded in the Xeon's chip set, that works with BIOS and OS power management (OSPM) to dynamically adjust platform power to achieve maximum performance per watt at the server level.
Among its features is Dynamic Power Monitoring, which measures actual power consumption of a server platform, providing real-time power-consumption data. The Platform Power Capping feature sets platform power to a targeted power budget while maintaining maximum performance for the given power level. The Power Threshold Alerting feature monitors platform power against a targeted power budget. When the target power budget cannot be maintained, Node Manager sends out alerts to the management console.
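The monitor-and-alert cycle those features describe can be sketched as follows. This is a minimal, hypothetical illustration, not Node Manager's actual interface; the real engine runs out-of-band in the chipset, and the sampling and alerting hooks here are stand-ins.

```python
# Hypothetical sketch of power-threshold alerting: compare measured
# power samples against a target budget and report any violations.

def check_budget(samples_w: list[int], budget_w: int) -> list[int]:
    """Return the samples that exceeded the power budget (alert-worthy)."""
    return [w for w in samples_w if w > budget_w]

readings = [290, 295, 310, 288, 305]   # illustrative power samples, in watts
violations = check_budget(readings, budget_w=300)
print(violations)  # samples that would trigger an alert to the console
```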
Intel also has developed a software add-on to Node Manager called Intel Datacenter Manager, designed to monitor and control power for a group of servers. Intel Datacenter Manager depends on Intel Dynamic Power Node Manager. Datacenter Manager features include group-level monitoring of power consumption, log querying for trend data, group power limiting, and group-level power alerts and notifications.
Baidu racks up savings
Baidu, China's largest search company, reports success using Intel's power-capping technology. Based on a proof-of-concept study of Baidu's application of the technology, the companies report that a datacenter using the technology could save up to 40 watts per system -- without performance impact. This translates into as much as 20 percent additional datacenter capacity within the same rack-level power envelope, and a potential rack-density improvement of 20 to 40 percent.
Baidu's predicament before deploying power capping was pretty typical: It was leasing racks at a datacenter, and each rack was power limited. The company sought to save money by cramming as many machines as possible into the fewest racks.
Testing Intel's power-capping wares started at the individual node level. Step one was to measure power consumption and performance at various levels of CPU utilization to identify the sweet spot for power management -- that is, where the server achieved the maximum power reduction with the minimum performance loss. The testing revealed that the optimal workload was reached at a CPU utilization of around 50 to 60 percent, with peak power at about 300W per server. Power consumption tended to stick at around 290W, with occasional spikes to 300W.
The next step was to test two levels of power capping: 260W and 200W. The minimum 40W power reduction was needed in order to add another server to each 5U rack, thus achieving the goal of increasing server density. The cap could not go below 200W, as that was the approximate amount of power the server needed simply to idle.
From these tests, Baidu determined that it could add another server per rack by reducing platform power consumption to as low as 250W, all the while maintaining an acceptable performance level.
The next test was at the rack level using Datacenter Manager. Rather than capping the power level of individual servers, the team developed a power-capping policy of 750W for a three-server rack (250W per server). Without power capping, rack-level power consumption hit 900W. With the cap in place, power consumption was clamped down close to 750W. There was some fluctuation due to the dynamic nature of the Baidu app's workload, but overall, performance remained at an acceptable level despite the cap on power.
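The simplest form of such a rack-level policy is to split the rack budget evenly and clamp each server to its share. This sketch is an assumption about how an even-share policy might look, not a description of Datacenter Manager's actual allocation logic, which can rebalance dynamically:

```python
# Clamp each server's draw to an even share of the rack power budget.

def apply_rack_cap(draws_w: list[float], rack_budget_w: float) -> list[float]:
    """Return per-server draws after clamping each to budget / server count."""
    share = rack_budget_w / len(draws_w)
    return [min(d, share) for d in draws_w]

# An uncapped rack drawing 900W gets held to the 750W policy (250W/server).
print(apply_rack_cap([310, 295, 295], rack_budget_w=750))
```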
In total, Intel and Baidu managed to reduce power consumption by as much as 50W at the server level, without significant impact on workload performance. At the rack level, they saw a potential for around 20 percent more capacity within the same power envelope and without performance impacts.
Intel isn't the only vendor out there offering power-capping technology. AMD, IBM, Dell, and HP have added power-capping features to their server-management software. I can see datacenter operators embracing this sort of technology with caution, as no one wants to be responsible for crippling business-critical applications. (The same could be said for the practice of powering down servers when they aren't in use.) At the same time, as datacenters continue to face limits on space and power, and as power-capping technology matures and proves its value in real-world applications, more companies should be amenable to at least giving it a test run.