M/M/S Queueing Model


This hypothetical example describes a simple M/M/S queueing model in which we adjust queue length and number of servers to handle customers during high-traffic periods.  [Note: if you are not familiar with Kendall Notation for queueing models, then before continuing you should read our introductory queueing optimization page.]

The initial model settings are intended to handle modest off-peak traffic. We will then adjust the parameters to handle peak traffic loads. Here are the starting values:

  • Arrival rate: 30 customers per hour
  • Average service time: 3 minutes per customer
  • Number of servers: 2
  • Service rate: 20 customers per hour per server (60 minutes ÷ 3 minutes per customer)
  • Queue length: 10 (maximum number of customers allowed to wait)
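
For reference, the quantities implied by these starting values can be worked out as follows. The symbols λ (arrival rate), μ (service rate per server), s (servers), q (queue length), and K (system capacity) are notation introduced here rather than taken from the original tables. Treating the queue length as waiting positions in addition to the customers being served is an assumption, chosen because it is consistent with the results reported in this example.

```latex
\begin{align*}
\mu  &= \frac{60\ \text{min/hour}}{3\ \text{min/customer}} = 20\ \text{customers/hour per server} \\
a    &= \frac{\lambda}{\mu} = \frac{30}{20} = 1.5 \quad \text{(traffic intensity)} \\
\rho &= \frac{a}{s} = \frac{1.5}{2} = 0.75 \quad \text{(offered load per server)} \\
K    &= s + q = 2 + 10 = 12 \quad \text{(maximum customers in the system)}
\end{align*}
```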

Note: This model could apply to many qualitatively different kinds of situations, including ones where there is no actual spatial queue in which customers stand.  It could, for example, apply to a telephone help desk or inbound telemarketing call center just as well as to a physical setting such as a retail store.  In the telephone cases, the time on hold would represent the time in queue, and the queue length would be the number of calls that the system will accept and put on hold before giving the caller a busy signal or playing a recorded message asking the caller to hang up and try again later.

These starting values give us the results shown in the following table:

MM2 30 arrival 10 queue 2 servers
As the table indicates, under these model conditions the system handles the traffic quite well, utilizing 74% of capacity.  Traffic intensity shows that the arrival rate of 30 customers per hour is 1.5 times the per-server service rate of 20 customers per hour.  On average, at any given time there will be about three customers in the system and about 1.6 customers waiting in the queue.  Although just over 60% of customers will have to wait in the queue, wait times are relatively brief.  There is less than a 1% chance that at any given moment the system will be full, in which case an arriving customer will balk (that is, refuse to wait in the queue, and thus not enter the system at all).
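
The figures above can be approximated with the standard steady-state formulas for an M/M/s queue with finite capacity. The following is a minimal sketch, not the tool used to produce the table. The function name `mmsk_metrics` and its parameter names are hypothetical, and the sketch assumes that the queue length counts only waiting customers, so that system capacity equals servers plus queue length.

```python
from math import factorial


def mmsk_metrics(arrival_rate, service_rate, servers, queue_length):
    """Steady-state metrics for an M/M/s queue with a finite waiting room.

    Rates are per hour; service_rate is per server.
    """
    a = arrival_rate / service_rate      # traffic intensity (offered load)
    rho = a / servers                    # offered load per server
    capacity = servers + queue_length    # assumption: queue excludes customers in service

    # Unnormalised probabilities of having n = 0..capacity customers in the system.
    weights = []
    for n in range(capacity + 1):
        if n <= servers:
            weights.append(a**n / factorial(n))
        else:
            weights.append(a**servers / factorial(servers) * rho ** (n - servers))
    total = sum(weights)
    p = [w / total for w in weights]

    p_full = p[capacity]                               # arriving customer is turned away
    effective_arrivals = arrival_rate * (1 - p_full)   # customers who actually enter
    in_queue = sum((n - servers) * p[n] for n in range(servers, capacity + 1))
    in_system = sum(n * p[n] for n in range(capacity + 1))

    return {
        "utilization": effective_arrivals / (servers * service_rate),
        "in_system": in_system,
        "in_queue": in_queue,
        "p_all_busy": sum(p[servers:]),    # probability that every server is busy
        "p_full": p_full,
        # Little's law, applied to the customers who enter the system.
        "wait_in_queue_min": 60 * in_queue / effective_arrivals,
    }


if __name__ == "__main__":
    for name, value in mmsk_metrics(30, 20, 2, 10).items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions, `mmsk_metrics(30, 20, 2, 10)` gives a utilization of about 74%, roughly 3.1 customers in the system, about 1.6 in the queue, and a probability of a full system just under 1%, in line with the values discussed above.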

The following graph shows the probability distribution for number of customers in the system:

MM2 30 arrival 10 queue

And here is the graph of the expected balking rate under various traffic conditions ranging from 30 to 60 customers per hour, assuming only two servers on duty and a queue length of 10:

MM2 balking rate 30 arrival 10 queue

As we saw in the previous table, the current balking probability (probability of a full system) is less than 1%, so things are moving smoothly.  But if traffic picks up, then more customers will balk.  We would like to make sure that we have sufficient capacity to keep the balking rate relatively low, thus ensuring an acceptable level of customer satisfaction.
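
A curve like the one in the graph can be sketched by recomputing the probability of a full system across a range of arrival rates. This is an illustrative sketch only: `balk_probability` is a hypothetical helper, the 5-customer step between rates is an arbitrary choice, and system capacity is again assumed to be servers plus queue length.

```python
from math import factorial


def balk_probability(arrival_rate, service_rate, servers, queue_length):
    """Probability that an M/M/s queue with a finite waiting room is full."""
    a = arrival_rate / service_rate
    capacity = servers + queue_length    # assumption: queue excludes customers in service
    weights = [
        a**n / factorial(n) if n <= servers
        else a**servers / factorial(servers) * (a / servers) ** (n - servers)
        for n in range(capacity + 1)
    ]
    return weights[-1] / sum(weights)


# Two servers, 20 customers per hour per server, queue length of 10.
curve = {lam: balk_probability(lam, 20, 2, 10) for lam in range(30, 61, 5)}
for lam, p in curve.items():
    print(f"{lam} customers/hour: balking probability {p:.2%}")
```

The balking probability rises steadily with the arrival rate, which is the behaviour the graph illustrates.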

Now let's look at what happens when traffic starts to pick up, and we have an average of 45 customers per hour arriving at the system:

MM2 45 arrival 10 queue 2 servers
Now we see that the system is definitely being taxed.  We are at 96% of capacity; traffic is arriving at a rate that is 2.25 times the per-server service rate; and there is an average of about 8 customers in the system and about 6 customers waiting in the queue.  About 94% of customers are forced to wait, and the balk rate has risen to more than 14%, which is unacceptable.  We need to make an adjustment to the system.

First let's look at what happens if we increase the queue length from 10 to 15 to encourage more people to wait in line instead of leaving:

MM2 45 arrival 15 queue 2 servers  

This hasn't helped much: the balk rate fell only slightly, from 14.43% to 12.74%; capacity utilization has risen to 98%; and wait times have increased dramatically.  We've run out of physical space, so we can't increase the queue length beyond 15; but even if we could increase it more, that wouldn't really solve our problem.  So it's time to add another server.
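
One way to see why a longer queue cannot solve the problem: two servers can complete at most 2 × 20 = 40 customers per hour, so with 45 arriving, at least 1 − 40/45 ≈ 11.1% must be turned away no matter how long the queue is. The sketch below illustrates this. `queue_stats` is a hypothetical helper, the queue lengths of 30 and 60 are hypothetical values used only to show the trend, and system capacity is assumed to be servers plus queue length.

```python
from math import factorial


def queue_stats(arrival_rate, service_rate, servers, queue_length):
    """Return (balking probability, mean wait in queue in minutes)."""
    a = arrival_rate / service_rate
    capacity = servers + queue_length    # assumption: queue excludes customers in service
    weights = [
        a**n / factorial(n) if n <= servers
        else a**servers / factorial(servers) * (a / servers) ** (n - servers)
        for n in range(capacity + 1)
    ]
    total = sum(weights)
    p_full = weights[-1] / total
    in_queue = sum((n - servers) * w for n, w in enumerate(weights) if n > servers) / total
    # Little's law, applied to the customers who actually enter the system.
    wait_min = 60 * in_queue / (arrival_rate * (1 - p_full))
    return p_full, wait_min


# Share of arrivals that cannot be served even with an unlimited queue.
floor = 1 - (2 * 20) / 45

results = {q: queue_stats(45, 20, 2, q) for q in (10, 15, 30, 60)}
for q, (p_full, wait_min) in results.items():
    print(f"queue {q}: balk rate {p_full:.2%}, mean wait {wait_min:.1f} min")
print(f"lower bound on balk rate: {floor:.2%}")
```

As the queue grows, the balk rate only creeps down toward the 11.1% bound while the mean wait keeps climbing, which is why adding a server is the more effective adjustment.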

Here's what happens when we increase the number of servers from two to three under the current traffic conditions:

MM2 45 arrival 15 queue 3 servers
Capacity utilization has now dropped back down to 75%, and the balk rate is back down to a very low level.  We could even reduce the queue length back down to 10.  Here is the result of reducing the queue length when we also have three servers:

 MM2 45 arrival 10 queue 3 servers

Capacity utilization remains good, and the balk rate is still under 1%, which is fine.  So it is clear that increasing the queue length doesn't matter nearly as much as adding a server.  In terms of sensitivity analysis, this means that the system is much more sensitive to a 50% change in the number of servers than to a 50% change in queue length.
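
The sensitivity comparison can be made explicit by changing one parameter at a time from the same baseline of two servers and a queue length of 10 at 45 customers per hour. This is a rough sketch under the same assumptions as the rest of this example (capacity equals servers plus queue length); `balk_probability` and the scenario labels are hypothetical names.

```python
from math import factorial


def balk_probability(arrival_rate, service_rate, servers, queue_length):
    """Probability that an M/M/s queue with a finite waiting room is full."""
    a = arrival_rate / service_rate
    capacity = servers + queue_length    # assumption: queue excludes customers in service
    weights = [
        a**n / factorial(n) if n <= servers
        else a**servers / factorial(servers) * (a / servers) ** (n - servers)
        for n in range(capacity + 1)
    ]
    return weights[-1] / sum(weights)


# Each scenario is (servers, queue length); only one parameter differs from the baseline.
scenarios = {
    "baseline": (2, 10),
    "queue length +50%": (2, 15),
    "servers +50%": (3, 10),
}
balk = {name: balk_probability(45, 20, s, q) for name, (s, q) in scenarios.items()}
for name, p in balk.items():
    print(f"{name}: balk rate {p:.2%}")
```

The longer queue trims the balk rate by less than two percentage points, while the extra server brings it below 1%.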

But we're not out of the woods yet.  Traffic continues to increase until it reaches a peak of 70 customers per hour.  With three servers and a queue length of 10, let's see what happens:

 MM2 70 arrival 10 queue 3 servers  

Now we're back up to 97% capacity utilization, wait times have increased, and the balk rate is now a whopping 16.57%.  So it's time to add a fourth server, which has the following result:
MM2 70 arrival 10 queue 4 servers
Utilization has dropped to 85% and wait times are acceptable.  The probability of balking is just under 3%, which is pretty good.  We can increase the queue length back up to its maximum of 15 to try to reduce the balk rate a bit more.  Here is the result:

 MM2 70 arrival 15 queue 4 servers

This works out well: utilization stays about the same, wait times are reasonable, and the balk rate is now down to 1.36%.

By keeping track of historical data over a sufficiently long period of time, we can continue the modeling process so that in the future we will be able to anticipate traffic intensity at various times of day and make the appropriate adjustments before the balk rate gets out of hand.
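
Once traffic forecasts are available, the staffing adjustment for each period could be looked up ahead of time. The sketch below searches for the smallest number of servers that keeps the balk rate under a target. The 3% target is a hypothetical threshold chosen for illustration, `servers_needed` and `balk_probability` are hypothetical helpers, and system capacity is assumed to be servers plus queue length.

```python
from math import factorial


def balk_probability(arrival_rate, service_rate, servers, queue_length):
    """Probability that an M/M/s queue with a finite waiting room is full."""
    a = arrival_rate / service_rate
    capacity = servers + queue_length    # assumption: queue excludes customers in service
    weights = [
        a**n / factorial(n) if n <= servers
        else a**servers / factorial(servers) * (a / servers) ** (n - servers)
        for n in range(capacity + 1)
    ]
    return weights[-1] / sum(weights)


def servers_needed(arrival_rate, service_rate, queue_length, max_balk, max_servers=50):
    """Smallest number of servers whose balk rate does not exceed max_balk."""
    for servers in range(1, max_servers + 1):
        if balk_probability(arrival_rate, service_rate, servers, queue_length) <= max_balk:
            return servers
    raise ValueError("target balk rate not reachable within max_servers")


# The three traffic levels used in this example, with a queue length of 10.
plan = {lam: servers_needed(lam, 20, 10, max_balk=0.03) for lam in (30, 45, 70)}
for lam, servers in plan.items():
    print(f"{lam} customers/hour: {servers} servers")
```

With these inputs the search arrives at two, three, and four servers for 30, 45, and 70 customers per hour, matching the adjustments made step by step above.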

