### Capacity Setting and Queuing Theory

```Capacity Setting
and
Queuing Theory
BAHC 510
Lecture 6
US Election
Nov 6, 2012
Capacity and Resources
• A key lever for improving patient flow.
• How do we measure capacity?
– What is the capacity of a 20 seat restaurant?
– A 16 bed ward?
• Capacity is a RATE
– Customers/hour
– Patients/day
• We can view a 16 bed ward as a queuing system with 16 servers
– What is the capacity of a bed?
– Does this analogy apply to the restaurant?
• A system is composed of resources with capacities.
– Often we use the expressions “resource” and “capacity”
interchangeably (hopefully without confusion)
How Much Capacity is Needed?
or How Many Resources are Needed?
[Figure: Ward Occupancy. Midnight census (axis 15 to 30) by day (axis 0 to 400), with base capacity and surge capacity levels marked]
Capacity tradeoffs when demand is variable
• Too much capacity or too many resources = idleness
• Not enough capacity = waits
• The resource manager must trade these off taking into
account system objectives and available resources
• Should we set capacity equal to demand?
– What does this mean?
– This is called a balanced system
– It works perfectly when there is no variation in the system
– It works terribly when there is variation! Why?
  • Once behind, you never can catch up.
• Queuing theory quantifies these tradeoffs in terms of
performance measures.
Queuing Models
• (Mathematical) queuing models help us set capacity
(or determine the number of resources needed) to
meet:
– Service level targets
– Average wait time targets
– Average queue length targets
• Queuing models provide a more precise alternative
to simulation
• They provide insights into how to plan, operate and
manage a system
• Where are there queues in the health care system?
A single server queuing system
[Diagram: Buffer → Server]
• A queue forms in a buffer
• Servers may be people or physical space
• The buffer may have a finite or unlimited capacity
• The most basic models assume “customers” are of one type
and have common arrival and service rates
A multiple (N) server queuing system
[Diagram: one Buffer feeding N Servers in parallel]
Several parallel single-server queues
[Diagram: three Buffers, each feeding its own Server]
Parallel Queues vs. Multiple server Queues
• Provide examples of multiple server queues
(MSQs)
• Provide examples of parallel queues (PQs)
• In what situations would each of these
queuing systems be most appropriate?
Why?
Networks of queues
• Most health care systems are
interconnected networks of queues and
servers with multiple waiting points and
heterogeneous customers.
– Provide some examples.
– Often we model these complex systems with
simulation.
• But in some cases we can use formulae to get
results
Queuing Theory background
• Developed by Erlang in the early 1900s to analyze
telephone systems.
– How many lines are needed to ensure that a caller who
tries to dial obtains a “line”?
– Depending on the system configuration, an arriving
customer may either be blocked or enter a queue
and wait for service.
• Queuing models are now applied to analyze internet
traffic, telecommunications systems, call
centers, airport security lines, banks and
restaurants, rail networks, etc.
Queues and Variability
• There are two components of a queuing system
subject to variability
– The inter-arrival times of “jobs”
– The service times or LOS
– Why are these variable?
• We describe this variability by
– Mean
– Standard deviation
– Probability distribution
• Usually the normal distribution doesn’t fit well
• Often an exponential distribution fits well
– If we know its rate or mean we know everything about it.
The exponential distribution
• P(T ≤ t) = 1 – e^(–λt)
– The quantity λ is the rate.
– The mean and standard deviation of the exponential distribution are both 1/λ.
– The median is ln(2)/λ = .693/λ
• Exponential distributions don’t allow negative times and have a small
probability of long service times.
• Example: Patients arrive at rate 4 per hour.
– The mean inter-arrival time is 15 minutes.
– The median inter-arrival time is 10.40 minutes.
– What is the probability that the time between two arrivals is less than
10 minutes (1/6 of an hour)?
• P(T ≤ 1/6) = 1 – e^(–4∙(1/6)) = 1 – e^(–2/3) = .487.
• The exponential distribution underlies queuing theory.
• A queue with exponential service times and exponential inter-arrival
times and N (FCFS) servers and an infinite waiting room is called an
M/M/N queue.
Capacity management and queuing
systems
• Capacity management involves determining the
number of servers to use and the size of the waiting
rooms.
• Examples
– How many long term care beds are needed?
– How many porters are needed?
– How many nurses are needed?
– How many cubicles are needed in an ED?
• Some healthcare systems have no buffers; all the
waiting is done outside of the system or in
upstream resources.
– ALC cases waiting for LTC beds
Analyzing a queuing system
• Inputs
– Arrival Rate
– Service Rate
– Number of Servers
– Buffer Size
– Population Size
• Queue Analyzer (Armann Ingolfsson’s Queuing Calculator)
• Outputs
– Capacity Utilization
– Wait Time in Queue
– Queue Length
– Blocking Probability
– Service Levels
Some Fundamental Quantities
• Inputs
– The arrival rate per hour: λ
– The service rate per hour: µ
– The number of servers: s
• Can be 1 or more
– The buffer size: K
• Can be finite or infinite
• Derived Quantities
– The offered load: λ/ µ = R
– Example: λ = 100 calls/hr and µ = 5 calls/hr per server
• Then the offered load is 20 (this quantity is unitless)
• This means the system needs at least 20 servers to meet its workload
• Another important quantity is the traffic intensity ρ = R/s
– It gives the offered load per server
– In the example, if there are 25 servers (s = 25), ρ = 20/25 = .8
– So servers should be busy 80% of the time.
• If the traffic intensity reaches or exceeds 1, a system with an unlimited buffer will be unstable.
Examples of Queuing Systems
• Walk-in clinic with 6 seats and 2 physicians
– s=2
– K=6
• Long term care facility with 100 beds
– s= 100
– K=?
• A Finite Capacity Loss System
– Model for an (old-fashioned) phone system
– s servers
– K = 0
– When all servers are busy, system is blocked and
customers are lost
• A Congestion System
– s servers
– K = ∞
– When all servers are busy, customers wait
Performance Measures
• Capacity Utilization
• Probability the system is empty
• Average waiting time (in queue) – Wq
• Flow Time - Average Total Time in System – W
• Average queue length – Lq
• Average number of jobs in the system – L
• Probability that a customer waits for service
• Probability that there are k customers in the system
• Service Level – Probability that a customer waits less than T
time units for service.
An Example - M/M/1 Queue
• Assume exponential inter-arrival time and service time distributions, infinite capacity
and 1 server (s=1)
• Calculations below are based on analytical expressions available in most operations
research texts.
• Customers arrive at rate 4 per hour, mean service time is 10 minutes.
– Service rate is 6 per hour
– System utilization = Probability the server is occupied = ρ = 4/6 = 2/3.
– Safety capacity = service rate – arrival rate = 2
– P(System is empty) = 1 – ρ = 1/3.
– P(k in the system) = ρ^k (1 – ρ) = (1/3)(2/3)^k
– Average Time in system = 1/safety capacity = 1/2 hour
– Average Time in queue = Average time in system – average service time = 1/2 – 1/6 =
1/3 hour
– Average Queue Length = ρ^2/(1 – ρ) = 4/3
• Suppose arrival rate increases to 5.9 customers per hour.
– Then ρ = 5.9/6 = .9833
– So P(System is empty) = .0167; Average time in system = 10 hours and Average
number of customers in the system = ρ/(1 – ρ) = 59 customers!
• An M/M/s queue is the same as an M/M/1 queue except that there may
be more than one server.
– In this model, there is a single buffer and s servers in the resource pool.
– Jobs are processed on a FIFO basis.
– When there are more than s jobs in the system, the buffer is occupied and
waiting for service occurs.
– The Erlang-C formula gives the probability an arriving job has to wait.
• An M/M/s/K queue is an M/M/s queue with a finite buffer of size K.
– There are at most K + s customers in the system.
– When the buffer is filled, the system is blocked and customers are lost.
• QUEUECALC computes performance measures for
– M/M/s queues
– M/M/s queues with a finite buffer size
– M/M/s queues with a finite population size
– M/G/1 queues
• In addition for a fixed T
– For specified s it computes the percentage of jobs waiting less than T time units
– It computes the number of servers needed to achieve a specified service level
  • How many servers are needed so that 90% of jobs wait no more than 10 minutes for service?
Problem 1
• Patients arrive at rate 5/hr. They require on average 1 hour of treatment.
– What is the offered load?
• How many service providers do we need to ensure that the average wait time
is 20 minutes or less?
– Assume a large waiting room.
• Observe that we require more than 5 servers to ensure a stable system.
• Run “The Waiting Line Analyzer” to find
– For 6 service providers: Average number in queue is 2.94 and average wait time in
queue is .5875 hours or 35.25 minutes
  • Note that with 6 service providers the probability a customer waits, which equals the
  probability all 6 are occupied, is 58.75%.
  • The capacity utilization is 83%
– For 7 service providers: Average number in queue is 0.81 and average wait is .1621
hours or 9.73 minutes.
  • Note that with 7 service providers the probability a customer waits, which equals the
  probability all 7 are occupied, is 32.41%.
  • The capacity utilization is 71%
• Observe the trade-off between capacity utilization and service!
More on Problem 1
• Service Levels
– Suppose our target wait times are 6 and 10
minutes; fill in the following table

Servers   P(Wq ≤ 6)   P(Wq ≤ 10)   Capacity Utilization
6
7
8
9
10
More on Problem 1
Servers   P(Wq ≤ 6)   P(Wq ≤ 10)   Capacity Utilization
6         .47         .50          83%
7         .73         .77          71%
8         .88         .90          63%
9         .95         .96          56%
10        .98         .98          50%
Still more on Problem 1
• Let’s explore the relationship between utilization (traffic intensity),
queue lengths and wait times
– Assume 5 servers and increase the arrival rate toward 5.

Arrival Rate   Utilization   Wait time in queue (hrs)   Queue Length
4              80%           0.55                       2.22
4.5            90%           1.52                       6.86
4.9            98%           9.50                       46.56
4.99           99.8%         99.50                      496.5

• Conclusion: as traffic intensity increases to 1, queue lengths
and wait times increase rapidly
Problem 2 – A small walk in clinic
• A walk in clinic has 3 doctors
• Average time spent with a patient is 12 minutes (5/hr)
• Patients arrive at rate of 12 per hour
• How many chairs should we have in the waiting room so
only 5% of patients are turned away?
• Solution
– Assume first an infinite waiting room
• This shows average queue length is 2.59
– Now try a model with a finite waiting room.
• With 3 chairs 9% balk and 52% wait
• With 4 chairs 7% balk and 55% wait
• With 5 chairs 5% balk and 58% wait
– In this last case average waiting time is .038 hours
• This seems too fast.
Problem 3 – Blocking in a Hospital Ward
• Bed requests arrive at the rate of 3 per day.
• Patients remain in beds for about 5 days
• How many beds are required so that the probability a patient is not
admitted on arrival is less than 10%?
• Model this as a finite capacity queuing system with no waiting room; we
want the blocking probability to be less than 0.1.
– Service rate = 1/5 = 0.2 patients per day
– Offered load = 3/.2 = 15 so we need at least 15 beds.
• With 15 beds 18% are blocked
• With 16 beds 14% are blocked
• With 17 beds 11% are blocked
• With 18 beds 9% are blocked
– In this case (s=18) the capacity utilization is 76%
• Graph gives occupancy distribution or census.
• This probability is computed using the Erlang-B formula
How can queuing theory improve porter
scheduling?
[Figure: Tuesday (Centralized Operation), Oct 2003 - Feb 2004. Current Capacity, Average Demand, and their Difference by hourly Time Slot (0:00 to 23:00); left axis: # of Porters; right axis: # of Excess Porter / Hr]
Assumption: Porters handle 3.3 trips/hour
Implications of queuing formulas
• As the safety capacity vanishes, or equivalently, the traffic
intensity increases to 1:
– waiting time increases without bound!
– queue lengths become arbitrarily long!
• In the presence of variability in inter-arrival times and service
times, a balanced system will be highly unstable.
• These formulas enable the manager to derive performance
measures on the basis of a few basic descriptors of the
queuing system
– The arrival rate
– The service rate
– The number of servers
• When the system has a finite buffer, the percentage of jobs
that are blocked can also be computed
Summary
• When the manager knows the arrival rate and service
rate, he/she can compute:
– The average number of jobs in the queue.
– The average time spent in the queue.
– The probability an arriving patient has to wait.
– The system utilization.
• This can be done without simulation!
• This information can be used to set capacity or explore the
sensitivity of recommendations to assumptions or
changes.
• Thus queuing theory provides a powerful tool to manage
capacity.
Don’t Match Capacity with Demand
• If service rate is close to arrival rate then there will be long wait times.
– Recall average queue length = ρ^2/(1 – ρ)
• For traffic intensity near 1, queue length will be very large.
[Figure: Queue Length vs. Traffic Intensity. Queue length (axis 0 to 120) plotted against traffic intensity (axis 0.5 to 1)]
Performance measure formulas
(M/M/1 queue – no limit on queue size)
• System Utilization = P(Server is occupied) = ρ
– If traffic intensity increases, the likelihood the server is occupied increases
– This occurs if the arrival rate increases or the service rate decreases
• P(System is empty) = 1 – ρ
• P(k in system) = ρ^k (1 – ρ)
• Average Time in System = 1/Safety capacity
• Average Time in Queue = Average time in system – average service time
– If safety capacity decreases, time in queue increases!
• Average Number of jobs in the system (including being served) = ρ/(1 – ρ)
• Average Queue Length = ρ^2/(1 – ρ)
• If we know safety capacity, service time and traffic intensity, we can compute all system
properties
• Little’s Law holds too
– number in queue = arrival rate × waiting time in queue
Idle Capacity And Wait Time Targets
[Figure: Relationship between Wait Times and Idle Capacity. Proportion of Patients Exceeding Wait Time Target (axis 0 to 100) plotted against Percentage of time there is idle capacity]
• To ensure only 5% of patients exceed wait time target, there will
be idle capacity 23% of the time.
```
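
The exponential-distribution example and the M/M/1 formulas in the slides can be checked numerically. The following is a minimal Python sketch, not part of the lecture; the function names `exp_cdf` and `mm1_measures` are illustrative assumptions. It uses only the formulas stated on the slides (utilization ρ = λ/µ, safety capacity = µ – λ, queue length = ρ²/(1 – ρ)).

```python
import math


def exp_cdf(rate, t):
    """P(T <= t) for an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * t)


def mm1_measures(arrival_rate, service_rate):
    """Steady-state M/M/1 measures (single server, unlimited buffer).

    Rates must share a time unit (e.g. per hour); times come back in that unit.
    """
    if arrival_rate >= service_rate:
        raise ValueError("unstable: arrival rate must be below service rate")
    rho = arrival_rate / service_rate                 # traffic intensity
    safety_capacity = service_rate - arrival_rate
    time_in_system = 1.0 / safety_capacity
    time_in_queue = time_in_system - 1.0 / service_rate
    return {
        "utilization": rho,
        "p_empty": 1.0 - rho,
        "time_in_system": time_in_system,
        "time_in_queue": time_in_queue,
        "number_in_system": rho / (1.0 - rho),
        "queue_length": rho ** 2 / (1.0 - rho),
    }


if __name__ == "__main__":
    # Probability that two arrivals are less than 10 minutes apart at 4/hr.
    print(exp_cdf(4, 1 / 6))
    # The slide example: 4 arrivals/hr, 6 services/hr, then 5.9 arrivals/hr.
    for lam in (4, 5.9):
        print(lam, mm1_measures(lam, 6))
```

Because Little's Law holds, `queue_length` should equal `arrival_rate * time_in_queue` for any stable input, which is a quick sanity check on the formulas.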
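
Problem 1 relies on a queuing calculator for the M/M/s results. As a rough stand-in, the sketch below computes the same quantities with the standard Erlang-C formula. It is an assumption-laden illustration, not the calculator used in the lecture: the helper names (`erlang_b`, `erlang_c`, `mms_measures`, `service_level`, `servers_needed`) are hypothetical, and the service-level expression P(Wq ≤ t) = 1 – C·e^(–(sµ – λ)t) is the textbook M/M/s result rather than something stated on the slides.

```python
import math


def erlang_b(servers, offered_load):
    """Erlang-B blocking probability via the usual stable recursion."""
    b = 1.0
    for n in range(1, servers + 1):
        b = offered_load * b / (n + offered_load * b)
    return b


def erlang_c(servers, offered_load):
    """Erlang-C: probability an arriving job has to wait in an M/M/s queue."""
    rho = offered_load / servers
    if rho >= 1.0:
        raise ValueError("unstable: need more servers than the offered load")
    b = erlang_b(servers, offered_load)
    return b / (1.0 - rho * (1.0 - b))


def mms_measures(arrival_rate, service_rate, servers):
    """Utilization, P(wait), Lq and Wq for an M/M/s queue with unlimited buffer."""
    load = arrival_rate / service_rate
    rho = load / servers
    p_wait = erlang_c(servers, load)
    queue_length = p_wait * rho / (1.0 - rho)
    return {
        "utilization": rho,
        "p_wait": p_wait,
        "queue_length": queue_length,
        "wait_in_queue": queue_length / arrival_rate,  # Little's Law
    }


def service_level(arrival_rate, service_rate, servers, target):
    """P(Wq <= target), with target in the same time unit as the rates."""
    p_wait = erlang_c(servers, arrival_rate / service_rate)
    slack = servers * service_rate - arrival_rate
    return 1.0 - p_wait * math.exp(-slack * target)


def servers_needed(arrival_rate, service_rate, max_avg_wait):
    """Smallest s whose average wait in queue is at most max_avg_wait."""
    s = math.floor(arrival_rate / service_rate) + 1   # smallest stable s
    while mms_measures(arrival_rate, service_rate, s)["wait_in_queue"] > max_avg_wait:
        s += 1
    return s


if __name__ == "__main__":
    # Problem 1: 5 patients/hr, 1 hour of treatment each, 20-minute target.
    print(servers_needed(5, 1, 20 / 60))
    for s in range(6, 11):
        print(s, mms_measures(5, 1, s), service_level(5, 1, s, 10 / 60))
```

Running `mms_measures(lam, 1, 5)` for arrival rates approaching 5 shows the same pattern as the "Still more on Problem 1" table: queue length and wait time grow rapidly as utilization nears 100%.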
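
Problems 2 and 3 involve blocking: a finite waiting room in the clinic and no waiting room at all on the ward. The sketch below is one possible way to reproduce the blocking percentages, assuming the standard Erlang-B recursion for the loss system and the standard birth-death steady-state weights for M/M/s/K. The function names (`erlang_b`, `beds_needed`, `mmsk_blocking`) are hypothetical and not taken from the lecture or its calculator.

```python
def erlang_b(servers, offered_load):
    """Erlang-B: fraction of arrivals blocked with s servers and no buffer."""
    b = 1.0
    for n in range(1, servers + 1):
        b = offered_load * b / (n + offered_load * b)
    return b


def beds_needed(arrival_rate, mean_stay, max_blocking):
    """Smallest number of beds with blocking probability below max_blocking."""
    load = arrival_rate * mean_stay       # offered load = lambda / mu
    beds = 1
    while erlang_b(beds, load) >= max_blocking:
        beds += 1
    return beds


def mmsk_blocking(arrival_rate, service_rate, servers, buffer_size):
    """Fraction of arrivals turned away in an M/M/s/K queue.

    States run from 0 to servers + buffer_size customers; an arrival that
    finds the system in the last state is lost.
    """
    load = arrival_rate / service_rate
    weights = [1.0]                       # unnormalized steady-state weights
    for n in range(1, servers + buffer_size + 1):
        # Departure rate in state n is min(n, servers) * service_rate.
        weights.append(weights[-1] * load / min(n, servers))
    return weights[-1] / sum(weights)


if __name__ == "__main__":
    # Problem 3: 3 bed requests/day, 5-day stays, under 10% blocked.
    beds = beds_needed(3, 5, 0.10)
    blocked = erlang_b(beds, 15)
    print(beds, blocked, 15 * (1 - blocked) / beds)   # beds, blocking, utilization
    # Problem 2: 3 doctors, 5 patients/hr each, 12 arrivals/hr, 3 to 5 chairs.
    for chairs in (3, 4, 5):
        print(chairs, mmsk_blocking(12, 5, 3, chairs))
```

Setting `buffer_size=0` in `mmsk_blocking` gives the same value as `erlang_b`, since a loss system is the K = 0 special case.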