Power Cost Reduction in Distributed Data Centers

Yuan Yao
University of Southern California
Joint work with Longbo Huang, Abhishek Sharma,
Leana Golubchik, and Michael Neely
IBM Student Workshop for Frontiers of Cloud Computing 2011
Paper to appear at INFOCOM 2012
Background and motivation
• Data centers are growing in number and size…
– Number of servers: Google (~1M)
– Data centers built in multiple locations
• IBM owns and operates hundreds of data centers worldwide
• …and in power cost!
– Google spends ~$100M/year on power
– Reduce cost on power while considering QoS
Existing Approaches
• Power efficient hardware design
• System design/Resource management
– Use existing infrastructure
– Exploit options in routing and resource management of
data center
Existing Approaches
• Power cost reduction through algorithm design
– Server level: power-speed scaling [Wierman09]
– Data center level: rightsizing [Gandhi10, Lin11]
– Inter data center level: Geographical load balancing [Qureshi09,
Our Approach: SAVE
• We provide a framework that allows us to exploit options in
all these levels
[Figure: SAVE spans the server level, the data center level, and the inter data center level. Diagram labels: "Job arrived", "Job served", "volatility of power prices", "StochAstic power".]
Our Model: data center and workload
• M geographically distributed data centers
• Each data center contains a front end server and a back end cluster
• Workloads Ai(t) (i.i.d.) arrive at front end servers and are routed
to one of the back end clusters
Our Model: server operation and cost
• Back end cluster of data center i contains Ni servers
– Ni(t) servers active
– Service rate of active servers: bi(t) ∈ [0, bmax]
• Power price at data center i: pi(t) (i.i.d.)
• Power usage at data center i:
• Power cost at data center i:
Our Model: two time scale
• The system we model operates on two time scales
– At t=kT, change the number of active servers Nj(t)
– At all time slots, change service rate bj(t)
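The two time scales can be illustrated with a minimal control loop. This is only a sketch: the function names and the two callbacks (`choose_servers`, `choose_rate`) are hypothetical placeholders for whatever decision rule is used, not part of the talk.

```python
def run_two_time_scale(num_slots, T, choose_servers, choose_rate):
    """Run a toy control loop over num_slots time slots.

    The number of active servers is re-chosen only when t is a multiple
    of T (t = kT); the service rate is re-chosen in every slot.
    Both callbacks are hypothetical stand-ins for a real policy.
    """
    if T <= 0:
        raise ValueError("T must be a positive integer")
    n_active = None
    history = []
    for t in range(num_slots):
        if t % T == 0:
            # Slow time scale: change the number of active servers.
            n_active = choose_servers(t)
        # Fast time scale: change the service rate every slot.
        rate = choose_rate(t, n_active)
        history.append((t, n_active, rate))
    return history


# Toy policies, for illustration only.
history = run_two_time_scale(
    num_slots=6,
    T=3,
    choose_servers=lambda t: 10 + t,
    choose_rate=lambda t, n: 0.5,
)
```

In this toy run the server count stays fixed within each block of T = 3 slots, while the rate callback is invoked in all six slots.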
Our Model: summary
• Input: power prices pi(t), job arrival Ai(t)
• Two time scale control action:
• Queue evolution:
• Objective: Minimize the time average power cost
subject to all constraints on Π, and queue stability
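As a rough illustration of the quantities in this summary, the sketch below computes a time average power cost and steps a single queue forward. The queue update uses a generic discrete-time form (serve first, then admit arrivals), which is an assumption made here for illustration; the slide's own queue evolution equation is not reproduced in the text.

```python
def time_average_cost(costs):
    """Average the per-slot power costs over the observed horizon."""
    if not costs:
        raise ValueError("need at least one time slot")
    return sum(costs) / len(costs)


def queue_step(queue, served, arrived):
    """One slot of a generic queue update (ASSUMED form, not from the talk):
    the queue drains by `served` (never below zero), then `arrived` jobs join."""
    return max(queue - served, 0.0) + arrived


# Toy numbers, for illustration only.
avg = time_average_cost([1.0, 2.0, 3.0])
q_next = queue_step(queue=5.0, served=8.0, arrived=3.0)
```

A queue is usually called stable when its time average size stays bounded; the sketch does not check this, it only shows the bookkeeping.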
SAVE: intuitions
• SAVE operates at both front end and back end
• Front end routing:
– When [condition not recovered], choose μij(t) > 0
• Back end server management:
– Choose small Nj(t) and bj(t) to reduce the power cost fj(t)
– When the queue at back end j is large, choose large Nj(t) and bj(t) to
stabilize the queue
SAVE: how it works
• Front end routing:
– In every time slot t, choose μij(t) to maximize
• Back end server management: choose V > 0
– At time slots t = kT, choose Nj(t) to minimize
– In every time slot τ, choose bj(τ) to minimize
• Serve jobs and update queue sizes
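The objective expressions for these steps did not survive in the text, so the sketch below substitutes a generic backpressure routing rule and a generic drift-plus-penalty rate choice from Lyapunov optimization. Both rules, and every name in the code (`route`, `choose_rate`, `mu_max`, `power`), are assumptions for illustration and may differ from what SAVE actually optimizes.

```python
def route(front_q, back_qs, mu_max):
    """Backpressure-style routing (ASSUMED rule): send up to mu_max jobs to
    the back end with the largest positive differential front_q - back_q;
    send nothing if no differential is positive."""
    diffs = [front_q - q for q in back_qs]
    best = max(range(len(back_qs)), key=lambda j: diffs[j])
    rates = [0.0] * len(back_qs)
    if diffs[best] > 0:
        rates[best] = mu_max
    return rates


def choose_rate(back_q, n_active, price, V, candidate_rates, power):
    """Drift-plus-penalty style choice (ASSUMED form): pick the rate b that
    minimizes V * price * power(n_active, b) - back_q * n_active * b."""
    return min(
        candidate_rates,
        key=lambda b: V * price * power(n_active, b) - back_q * n_active * b,
    )


# Hypothetical power model: usage grows quadratically in the service rate.
toy_power = lambda n, b: n * b ** 2

mu = route(front_q=10.0, back_qs=[3.0, 8.0, 12.0], mu_max=5.0)
b_busy = choose_rate(100.0, 2, 1.0, 1.0, [0.0, 0.5, 1.0], toy_power)
b_idle = choose_rate(0.0, 2, 1.0, 1.0, [0.0, 0.5, 1.0], toy_power)
```

With a large back end queue the sketch picks the highest candidate rate, and with an empty queue it picks zero, which matches the intuition that servers work hard only when the backlog justifies the power cost.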
SAVE: performance
• Theorem on performance of our approach:
– Delay of SAVE ≤ O(V)
– Power cost of SAVE ≤ Power cost of OPTIMAL + O(1/V)
– OPTIMAL can be any scheme that stabilizes the queues
• V controls the trade-off between average queue size
(delay) and average power cost.
• SAVE is suited to delay tolerant workloads
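The role of V can be made concrete with a small calculation. The constants below are hypothetical: the O(V) and O(1/V) bounds hide constants that the slides do not give, so only the scaling is meaningful.

```python
def tradeoff_bounds(V, delay_const=1.0, cost_const=1.0):
    """Return (delay_bound, cost_gap_bound) for a given V.

    delay_bound grows linearly in V and cost_gap_bound shrinks as 1/V,
    mirroring the O(V) and O(1/V) terms. Both constants are HYPOTHETICAL.
    """
    if V <= 0:
        raise ValueError("V must be positive")
    return delay_const * V, cost_const / V


# Doubling V doubles the delay bound and halves the cost gap bound.
small = tradeoff_bounds(2.0)
large = tradeoff_bounds(4.0)
```

This is why a large V fits delay tolerant workloads: the cost gap to OPTIMAL shrinks, at the price of longer queues.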
Experimental Setup
• We simulate data centers at 7 locations
– Real world power prices
– Poisson arrivals
• We use synthetic workloads that mimic MapReduce jobs
• Power cost = power price × power usage
– Power usage = consumption of active servers + consumption of servers in sleep
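The power cost used in the experiments can be sketched as a short function. The per-server consumption figures and parameter names are hypothetical, and the sketch assumes a fixed consumption per active server, whereas in the model it may depend on the service rate.

```python
def power_cost(price, n_active, n_total, active_power, sleep_power):
    """Power cost for one time slot at one data center.

    usage = consumption of active servers + consumption of sleeping servers
    cost  = power price * usage
    active_power and sleep_power are HYPOTHETICAL per-server figures.
    """
    if not 0 <= n_active <= n_total:
        raise ValueError("n_active must lie between 0 and n_total")
    n_sleep = n_total - n_active
    usage = n_active * active_power + n_sleep * sleep_power
    return price * usage


# Toy numbers: 10 of 100 servers active.
cost = power_cost(price=0.5, n_active=10, n_total=100,
                  active_power=200.0, sleep_power=10.0)
```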
Experimental Setup: Heuristics for comparison
• Local Computation
– Send jobs to local back end
• Load Balancing
– Evenly split jobs to all back ends
– All servers are activated
• Low Price (similar to [Qureshi09])
– Send more jobs to places with low power prices
• Instant On/Off
– Routing is the same as Load Balancing
– Data center i tunes Ni(t) and bi(t) every time slot to minimize its
power cost
– No additional cost for activating servers or putting them to sleep
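The routing side of the first three heuristics can be sketched as functions that return, for one front end, the fraction of its jobs sent to each back end. The inverse-price weighting used for Low Price is an assumption chosen for illustration; the slides only say that more jobs go to cheaper locations.

```python
def local_computation(i, num_sites):
    """Front end i sends all of its jobs to its own back end."""
    return [1.0 if j == i else 0.0 for j in range(num_sites)]


def load_balancing(num_sites):
    """Split jobs evenly across all back ends."""
    return [1.0 / num_sites] * num_sites


def low_price(prices):
    """Send more jobs to cheaper sites. Weights are proportional to
    1 / price, which is an ASSUMED rule, not the one from the talk."""
    inverse = [1.0 / p for p in prices]
    total = sum(inverse)
    return [w / total for w in inverse]


# Toy example with three sites and hypothetical prices.
local = local_computation(1, 3)
even = load_balancing(4)
cheap_first = low_price([1.0, 2.0, 4.0])
```

Each function returns fractions that sum to one, so the total workload is preserved regardless of the heuristic.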
Experimental Results
[Figure: relative power cost reduction as compared to Local Computation]
• As V increases, power cost reduction grows from ~0.1% to
• SAVE is more effective for delay tolerant workloads.
Experimental Results: Power Usage
• We record the actual power usage (not cost) of all
schemes in our experiments
• Our approach also reduces power usage
Conclusions
• We propose a two time scale, non work conserving control
algorithm aimed at reducing power cost in distributed data centers.
• Our work facilitates an explicit power cost vs. delay trade-off
• We derive analytical bounds on the time average power cost and
service delay achieved by our algorithm
• Through simulations we show that our approach can reduce the
power cost by as much as 18%, and that it also reduces power usage
Future work
• Other problems on power reduction in data centers
– Scheduling algorithms to save power
– Delay sensitive workloads
– Virtualized environment, when migration is available
• Please check out our paper:
– "Data Centers Power Reduction: A Two Time Scale
Approach for Delay Tolerant Workloads", to appear at
INFOCOM 2012
