June 5, 2009
Vol. 31 Issue 16
Page(s) 36 in print issue
Optimizing The Sleep Cycle
How Researchers Are Tackling The Power-Hungry Habits Of Idle Servers
Servers are a restless bunch by design. Like insomniacs with a nagging Red Bull habit, they rarely sleep through idle periods, instead waking every few hundred milliseconds to see if there’s a job waiting for them—regardless of whether a job was waiting last time or whether one is waiting now. This constant sleep/wake cycle wastes tremendous amounts of energy in data centers and has spurred researchers to seek more efficient methods for saving power during idle server time.
• Due to inefficient design, servers constantly transition from a sleep period to a full-bore computing state, in turn consuming exorbitant amounts of power.
• Many server components do indeed feature low-power states, but they don’t often exist in the same hardware configuration. If they do, software hasn’t been available—until now—that can coordinate those states.
• Although sleep modes save power, they can also lead to inefficient use of other server-related infrastructure, so researchers are examining nontraditional ways of handling computation, such as moving tasks to parts of the planet where electrical rates are lower.
“In the typical data center, only about a fifth of servers are actually actively doing something at any particular time,” explains Thomas Wenisch, an assistant professor in the University of Michigan’s Department of Electrical Engineering and Computer Science. “The problem with the servers that are built today is that when they’re idle, they still consume about 60% of the power that they consume at peak, when they’re completely loaded, doing useful work. So the vast majority of the energy in these data centers is being left on the table. We’re just wasting that energy.”
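Wenisch’s figures imply that most of a data center’s power bill buys no useful work. A back-of-envelope sketch of his numbers (the per-server peak wattage is an assumed figure for illustration):

```python
# Illustration of the waste Wenisch describes: ~1 in 5 servers busy
# at any instant, yet idle servers still draw ~60% of peak power.
PEAK_W = 300.0          # assumed peak draw of one server, watts
IDLE_FRACTION = 0.8     # ~4 in 5 servers idle at any instant
IDLE_POWER_RATIO = 0.6  # idle draw as a fraction of peak

avg_power = PEAK_W * ((1 - IDLE_FRACTION) * 1.0 + IDLE_FRACTION * IDLE_POWER_RATIO)
busy_power = PEAK_W * (1 - IDLE_FRACTION)

print(f"average draw per server: {avg_power:.0f} W")          # ~204 W
print(f"of which spent idling:   {avg_power - busy_power:.0f} W")  # ~144 W
```

Under these assumptions, roughly 70% of the average draw is consumed by servers doing nothing, which is the energy Wenisch says is “being left on the table.”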
Because servers must be ready to throw their full horsepower at tasks at any given time, finding a sleep solution that doesn’t negatively impact overall data center performance is a highly challenging task. However, Wenisch and other researchers are making major headway in the quest to conserve server power.
All Together Now
Wenisch, along with students David Meisner and Brian Gold, recently unveiled research that investigated a method for putting idle servers to sleep. Based on extensive analysis of 600 servers, the research found that the average server idle period is hundreds of milliseconds, while the average server busy period is just tens of milliseconds. When transitioning between these sleep and work states, servers typically move from a relatively deep slumber to full steam and back again, even if there’s no job available or the job doesn’t require full power.
To address this problem, Wenisch and his team have devised a software/hardware tandem, called PowerNap and RAILS (Redundant Array for Inexpensive Load Sharing), that detects these brief periods when the server has nothing to do and shuts down as much of the system as possible. Although data centers generally opt for components that offer peak performance, some components do provide low-power modes, such as the self-refresh mode in DRAM and energy-saving modes in CPUs. PowerNap coordinates these components at the operating system level to enable a rapid state transition the moment PowerNap detects the system is idle.
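The coordination idea can be sketched in a few lines. This is a toy model, not the actual PowerNap implementation; the component names and wattages are assumptions chosen to mirror the article’s description:

```python
# Illustrative per-component power draw in watts: (active, sleep).
# Figures are assumptions for this sketch, not measurements.
COMPONENTS = {
    "cpu":  (95.0, 2.0),   # e.g., a deep C-state
    "dram": (30.0, 1.5),   # e.g., self-refresh mode
    "nic":  (10.0, 0.5),
}

class Server:
    def __init__(self):
        self.asleep = False

    def power_draw(self):
        idx = 1 if self.asleep else 0
        return sum(draw[idx] for draw in COMPONENTS.values())

    def nap(self):
        # The key idea: the moment the run queue is empty, transition
        # *every* component to its lowest state in one coordinated step,
        # rather than letting each device manage itself independently.
        self.asleep = True

    def wake(self):
        self.asleep = False

s = Server()
print(s.power_draw())  # 135.0 (all components active)
s.nap()
print(s.power_draw())  # 4.0 (all components in lowest sleep state)
```

The point of the sketch is the all-or-nothing transition: the savings Wenisch cites (hundreds of watts down to the order of 10) come from moving every device at once, on millisecond time scales.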
“Based on our survey of the individual device technologies, if you were to select components for a server based on their energy-saving features, you should be able to transition into a sleep mode on millisecond time scales and go from hundreds of watts for a blade server to on the order of 10 or so watts for all of these components in their lowest sleep state,” Wenisch says.
Struggling With Supply
But there are other challenges to overcome beyond device coordination. According to Wenisch, the slowest component—and the one in which existing sleep states just don’t measure up—is the server’s power supply. Today’s high-end blade systems employ a small number of high-powered, enterprise-class power supplies that can deliver kilowatts of power. Because each blade in a rack draws only a couple of hundred watts when it’s idle, those power supplies spend much of their time operating far below their rated capacities, regardless of what the servers are doing.
“The thing is, when you reduce the current draw on a power supply—when you operate it well away from its peak capacity—the energy efficiency of converting AC to DC power drops off precipitously,” Wenisch explains.
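The shape of that drop-off can be modeled with a toy efficiency curve. The constants below are assumptions, not datasheet values; the sketch only shows why a kilowatt-class supply feeding an idle rack converts power so poorly:

```python
# Toy model of AC-to-DC conversion efficiency vs. load.  A supply has
# fixed overhead losses (independent of load) plus losses that scale
# with load; at light load the fixed overhead dominates, so efficiency
# falls off steeply -- the effect Wenisch describes.
def efficiency(load_w, rated_w):
    overhead = 0.05 * rated_w   # assumed fixed losses
    resistive = 0.08 * load_w   # assumed load-proportional losses
    return load_w / (load_w + overhead + resistive)

RATED_W = 2250.0  # one big enterprise-class supply
for load in (2250, 1000, 200, 50):
    print(f"{load:>5} W load: {efficiency(load, RATED_W):.0%} efficient")
```

Near rated capacity the modeled supply converts power reasonably well, but at the couple-hundred-watt draw of idle blades it wastes a large share of the input, which is what makes oversized supplies so costly in practice.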
This is where Wenisch’s RAILS approach comes in. It’s no accident that the RAILS acronym looks suspiciously similar to RAID, a concept in which many inexpensive disks are used in place of a smaller number of more expensive disks. With RAILS, a data center would deploy multiple smaller (think 500W) power supplies in place of one 2,250W power supply.
“The key idea is to cut power supplies in and out as servers wake up and go to sleep so that at any particular time, the number of power supplies that are on matches the number of servers that are active and awake,” Wenisch says.
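The switching policy Wenisch describes amounts to activating just enough small supplies to cover the current demand, so each active supply runs near its rated capacity where conversion is most efficient. A minimal sketch, assuming 500W commodity supplies:

```python
import math

SUPPLY_RATED_W = 500.0  # many small commodity supplies, per RAILS

def supplies_needed(demand_w):
    # Cut supplies in and out with load: activate the fewest supplies
    # that can cover demand, keeping each near rated capacity.
    return max(1, math.ceil(demand_w / SUPPLY_RATED_W))

for demand in (2000, 800, 120):
    n = supplies_needed(demand)
    print(f"{demand} W demand -> {n} supplies at {demand / n:.0f} W each")
```

As racks of servers nap and wake, the demand figure changes and supplies are switched in and out to track it, rather than one oversized supply idling far from its efficiency sweet spot.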
The fundamental research for the PowerNap and RAILS solution is complete, and Wenisch and his team predict they’re about a year away from building a prototype with industry partners.
Looking Outside The Box
Ask Pete Beckman, director of the Leadership Computing Facility at Argonne National Laboratory (www.anl.gov), about power in data centers, and he’ll point to how this nation’s early settlers viewed water as a limitless resource. Just as we now know that water in the United States is not limitless, data centers can no longer assume an endless supply of electrical power. Beckman oversees the Intrepid, a 557-teraflop supercomputer that churns through scientific calculations around the clock.
“We have been optimizing the energy consumption,” Beckman says. “[The Intrepid’s] architecture is already about twice as energy-efficient as other supercomputer designs, but there is still a lot more optimizing that can be done.” Although the power consumed by servers can be conserved by shutting them off or putting them into low-power modes, he notes that these techniques lead to inefficient use of other infrastructure, such as transformers and chillers.
“Electricity should never be wasted, so if there is no work to do, servers should automatically spin down to the lowest sleep state possible. However, the best solution is to improve our techniques at predicting and then managing server load,” Beckman says. “For jobs that are latency- and bandwidth-tolerant, we can even imagine moving the computation to the other side of the planet, where nighttime electrical rates might be lower. The cloud computing revolution is just beginning, and as we commoditize computation with virtual machines, we also make it possible to be more fluid and mobile and open other opportunities for cost and energy savings.”
by Christian Perry