Contact Centre Case Study
Minitab Project Report
Assume that we have two call centres A and B.
A sample of 60 people from call centre A and 80 from call centre B was taken.
Call duration data has been collected since Jan 02.
Every person has 25 data points over this time (i.e. a sample of call duration).
There are a number of potential key Xs:
- call centre (A vs B)
- time of day (data sampled hourly 08:00 to 16:00)
- type of equipment (modern vs old)
- promotional activity (Y vs N - promotional activity takes place every 3 months for 1 month at both call centres)
- experience of operatives (new vs experienced, where experienced = 1 year or more)
The spec is single sided; target Y1=30 secs, upper limit = 35 secs for a call.
Let's look at the shape and location of the data for A and for B:
Clearly, the data is non-normal.
The mean is 25.6 secs per call, which is within spec.
Some calls took less than 15 secs, whilst others took over 33 secs.
50% of calls took more than 25.9 secs.
The 95% confidence limits for the true population mean are 25.4 to 25.8 secs.
The 95% confidence limits for the true population standard deviation (spread) are 3.55 secs to 3.82 secs.
This seems larger than for B (see below) - but is this statistically significant?
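As a rough cross-check of the limits quoted above, the sketch below recomputes them from the summary statistics alone. It is an approximation under stated assumptions, not Minitab's method: it uses large-sample normal formulas, and the standard-deviation interval relies on the approximation SE(s) = s / sqrt(2(n - 1)), which is only a guide for non-normal data. The function names `mean_ci` and `sd_ci` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(mean, sd, n, conf=0.95):
    """Large-sample confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    half = z * sd / sqrt(n)
    return mean - half, mean + half

def sd_ci(sd, n, conf=0.95):
    """Approximate interval for a standard deviation.

    Assumes SE(s) ~ s / sqrt(2(n - 1)); a chi-square based interval
    (as statistical packages report) is slightly asymmetric.
    """
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    half = z * sd / sqrt(2 * (n - 1))
    return sd - half, sd + half

# Summary statistics quoted in the text for call centre A (60 people x 25 calls).
a_mean_ci = mean_ci(25.6, 3.68, 1500)
a_sd_ci = sd_ci(3.68, 1500)
print("mean CI:", a_mean_ci)
print("sd CI:  ", a_sd_ci)
```

The same two calls with the call centre B figures (mean 29.96, standard deviation 3.23, N = 2000) give the corresponding limits for B.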
The data is again non-normal for B.
The mean is 29.96 secs per call, which is within spec.
Some calls took less than 20 secs, whilst others took over 38.5 secs (outside spec).
50% of calls took more than 30 secs.
The 95% confidence limits for the true population mean are 29.82 to 30.10 secs.
This looks significantly larger than the mean for A, but is it statistically significant?
The 95% confidence limits for the true population standard deviation (spread) are 3.1 secs to 3.3 secs.
Let's test the hypothesis about spreads of the times for A and B:
Both data sets are non-normal so we must use Levene's p-value, testing the hypothesis that the variation in the two call centres is different.
Clearly, the difference in variation between call centres A and B is highly statistically significant; variability of call duration is higher in call centre A than in call centre B - why?
Given this assertion, is a difference in standard deviation of 0.5 secs PRACTICALLY significant to YOU?
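For readers without Minitab, here is a minimal sketch of how a Levene statistic can be computed by hand. It uses the median-centred form of the test, which is one common variant. The raw call durations are not reproduced in this case study, so the two samples below are synthetic normal draws built from the quoted means and standard deviations; they are stand-ins, not the real data, and the printed numbers will not match the Minitab output. The p-value uses a large-sample approximation that only applies to two big groups.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, median

def levene_w(groups):
    """Levene statistic W using absolute deviations from each group median."""
    devs = []
    for g in groups:
        centre = median(g)
        devs.append([abs(x - centre) for x in g])
    n_total = sum(len(d) for d in devs)
    k = len(devs)
    dev_means = [mean(d) for d in devs]
    grand = sum(sum(d) for d in devs) / n_total
    between = sum(len(d) * (m - grand) ** 2 for d, m in zip(devs, dev_means))
    within = sum((z - m) ** 2 for d, m in zip(devs, dev_means) for z in d)
    return (n_total - k) / (k - 1) * between / within

# Synthetic stand-in data (NOT the case-study measurements).
rng = random.Random(1)
a_times = [rng.gauss(25.6, 3.68) for _ in range(1500)]
b_times = [rng.gauss(29.96, 3.23) for _ in range(2000)]

w = levene_w([a_times, b_times])
# With two large groups, W behaves approximately like the square of a
# standard normal variable, which gives a quick approximate p-value.
p_value = 2 * (1 - NormalDist().cdf(sqrt(w)))
print(f"W = {w:.2f}, approximate p = {p_value:.2g}")
```

With more than two groups, or small samples, the p-value should come from the F distribution in a statistics package rather than from this shortcut.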
What can we say about the difference in mean time to deal with a call between call centres?
Since the two data sets are independent, we can do a 2-sample t test.
We cannot assume equal variances, as we've just seen, so we'll have less power to discern a difference - but this may not be a problem in this case.

Two-sample t test:
| Call centre | N | Mean | StDev | SE Mean |
| Atime | 1500 | 25.6 | 3.68 | 0.095 |
| Btime | 2000 | 29.96 | 3.23 | 0.072 |
Difference = mu (Atime) - mu (Btime)
Estimate for difference: -4.364 sec
95% CI for difference: (-4.598, -4.130) sec
T-Test of difference = 0 (vs not =): T-Value = -36.57 P-Value = 0.000 DF = 2982
The difference in mean time to deal with the calls is clearly not due to 'chance' sampling error; the 95% confidence interval for the difference in mean times (A - B) is -4.6 to -4.1 secs, so calls DO take less time in call centre A than in call centre B: on average 4.4 secs less per call!
So, we have a statistically significant difference and a practically significant difference.
The two results, for the differences in variability and in mean, give us the right, statistically, to investigate confidently the reasons for these differences: what drives them?
If we can find this out then we have the potential to reduce average call times, and improve their consistency.
The standard deviation of the difference in means is 0.119 secs (the Root Sum of Squares of the SE Mean values above).
This gives rise to a very precise difference in means.
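The figures in the Minitab output can be approximated from the summary table alone. The sketch below is an illustration, not Minitab's implementation: it computes the unequal-variance (Welch) t statistic, its degrees of freedom and the Root Sum of Squares standard error. Because the degrees of freedom run into the thousands, a normal quantile is assumed in place of the t quantile for the confidence interval. Inputs are the rounded table values, so results differ slightly from the output above. The function name `welch_from_summary` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def welch_from_summary(m1, s1, n1, m2, s2, n2, conf=0.95):
    """Welch two-sample t statistic, df, SE and CI from summary statistics."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2   # squared standard errors of each mean
    se = sqrt(v1 + v2)                    # root sum of squares of the SE Means
    diff = m1 - m2
    t_value = diff / se
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Assumption: df is large, so a normal quantile stands in for the t quantile.
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    return t_value, df, se, (diff - z * se, diff + z * se)

t_value, df, se, ci = welch_from_summary(25.6, 3.68, 1500, 29.96, 3.23, 2000)
print(f"t = {t_value:.2f}, df = {df:.0f}, SE = {se:.3f}")
print(f"95% CI for A - B: ({ci[0]:.2f}, {ci[1]:.2f}) secs")
```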
If we take the 'worst case' for the difference between A and B as -4.1 secs (the end of the confidence interval closest to zero), then this represents a fraction 4.1/25.6, i.e. a call in call centre B takes approx. 16% longer than the average time taken by a person from call centre A to deal with a call.
This could be a productivity improvement opportunity.
Can we explain why these distributions are non-normal and have the spread that they do?
There may be reasons why call centre B takes longer per call than A, and/or there may be opportunities to significantly reduce the mean/variation of both A and B times . . .
Let's investigate the influence of the Xs . . .

There are clearly differences associated with specific operators . . . why are operators 1 to 15 taking on average about 5 secs longer than the rest?
If we look at the experience of each operator we get a possible explanation . . .

New operators in call centre A (i.e. those that have been there less than 1 year) are taking longer, on average, than Experienced operators.
Inexperienced operators, however, appear to have less variability than experienced ones.
We could again test this using a Test of Equal Variances if we believed that we could find a reason for it if it really was the case . . . we won't do this here.
NB: It is very important to realise that, whilst it is tempting to say that this 'proves' that experience is the CAUSE of the difference, it is not correct to say this.
To prove causality one needs a properly designed experiment.
What about the importance of the Equipment - Modern vs Old?

It appears that Old equipment has a small, but perhaps significant effect on time taken.
Let's again do a 2 sample t test.
First we have to check normality of each dataset, and homogeneity of variance:
Both are non-normal, so again we have to use Levene's test for Equal Variances:
We see that we cannot refute the equality of variances for type of equipment; for all intents and purposes the populations have the same spread.
Do they have the same mean?
Two-sample T for call centre A
| Equipment | N | Mean | StDev | SE Mean |
| Modern | 625 | 24.34 | 3.52 | 0.14 |
| Old | 875 | 26.5 | 3.53 | 0.12 |
Difference = mu (Modern) - mu (Old)
Estimate for difference: -2.163
95% CI for difference: (-2.525, -1.801)
T-Test of difference = 0 (vs not =): T-Value = -11.72 P-Value = 0.000 DF = 1498
Both use Pooled StDev = 3.52
The answer is no! Old equipment adds between 1.8 and 2.5 secs to the call duration. This difference is significant to over 99.9%.
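Since Levene's test gave no reason to doubt equal variances here, the test above pools the two standard deviations. A minimal sketch of that pooled calculation from the summary table follows. It assumes a normal quantile is an adequate substitute for the t quantile at 1498 degrees of freedom, and it works from the rounded table values, so the last digit can differ from the Minitab output. The function name `pooled_t_from_summary` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def pooled_t_from_summary(m1, s1, n1, m2, s2, n2, conf=0.95):
    """Equal-variance two-sample t statistic and CI from summary statistics."""
    df = n1 + n2 - 2
    # Pooled standard deviation: variances weighted by their degrees of freedom.
    sp = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    se = sp * sqrt(1 / n1 + 1 / n2)
    diff = m1 - m2
    # Assumption: df is large, so a normal quantile stands in for the t quantile.
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    return diff / se, df, sp, (diff - z * se, diff + z * se)

t_value, df, sp, ci = pooled_t_from_summary(24.34, 3.52, 625, 26.5, 3.53, 875)
print(f"t = {t_value:.2f}, df = {df}, pooled StDev = {sp:.2f}")
print(f"95% CI for Modern - Old: ({ci[0]:.2f}, {ci[1]:.2f}) secs")
```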
What about time of day for call centre A?

It appears that calls take less time to answer at 8am and 4pm.
These times coincide with starting and finishing!
Perhaps operators are fresh first thing in the morning, and tend to curtail conversations when it's time to go home?
Is this difference significant?
To test for this we need to perform Analysis of Variance (ANOVA).
Before doing this, however, we need to test for normality of each dataset and test for Equality of Variances, since these are assumptions of ANOVA.
It turns out that most populations for time of day for call centre A are non-normal, so we must use Levene's test of Equality of Variances:

Levene's p-value of 0.036 gives us over 95% confidence that at least one of the variances is different to the rest. A look in the session window in Minitab gives us the actual values for the 95% confidence intervals for the population standard deviations for each time:
Test for Equal Variances: call centre A


At the times we're interested in (8am and 4pm) the variation in call duration is higher than most of the other times - is this because some operators are more prone to a start/finish time effect than others?
(This could be investigated on an individual-by-individual basis if required.)
So, now we have all the information we need to evaluate whether the observed difference in average call duration at 8am and 4pm is statistically significant:
One-way ANOVA: Atime versus timeofdayA

We can see that it is!!
We are over 99.9% confident that this difference exists in the 'population' of operatives in call centre A.
The average difference is about 3 secs.
This represents a large opportunity for reduction in answering times for the other times of day.
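To show what the one-way ANOVA is doing under the bonnet, the sketch below computes the F statistic as the ratio of between-group to within-group mean squares. The hourly durations in it are made-up numbers for three times of day, purely for illustration; they are not the case-study data. Converting F to a p-value needs the F distribution, which is left to a statistics package.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical call durations (secs) by time of day, for illustration only.
example = {
    "08:00": [22.0, 23.5, 21.8, 24.1],
    "12:00": [26.2, 25.1, 27.0, 26.4],
    "16:00": [22.9, 23.3, 21.5, 24.0],
}
f_stat, df_b, df_w = one_way_anova_f(list(example.values()))
print(f"F = {f_stat:.2f} on ({df_b}, {df_w}) degrees of freedom")
```

A large F relative to the F distribution with those degrees of freedom indicates that at least one time of day has a different mean duration.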
What about promotional activity ?
Let's again focus on call centre A:

It would appear that promotional activity reduces the average time to deal with a call, but does not reduce the variability.
We can again check this hypothesis out with a 2-sample t test.
Cutting to the chase (one would do normality tests and Equality of Variance tests as a matter of course, but we'll ignore these for brevity's sake, and assume equal variances for the test) . . .

There is a highly statistically significant difference in mean time (over 99.9% confident in a difference).
The difference could be as great as 2.37, or as little as 1.5 secs per call lower when there is promotional activity than when there is not.
Whether this is practically important to you is up to you to decide.
TRENDS WITH TIME
Let's plot the call duration for call centre A as a function of time, and investigate presence of special causes.
Because there is no natural subgrouping, an Individuals Moving Range chart is appropriate.

In this case, this chart doesn't tell us a great deal, except that the limits are 'wide' relative to the specification of +/- 5 secs, and there are a number of data points which have higher variability than the 'norm' (see points labelled '1' in the lower chart).
It may be instructive to construct one of these charts for each operator, to monitor overall mean performance and variability relative to their past performance, and relative to other operatives . . .
Control charts are an excellent way to see the effect of special causes, like promotions etc., and hence help to quantify the effects of various Xs 'independently'.
Example below is given for operatives 1 and 2 for call centre A:

These charts are intended to be done in real time, so that when a point falls 'out of control' it can be investigated and actioned immediately.
Perhaps there was a particularly difficult customer, or the call was 'cut off' etc.
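The control limits on an Individuals Moving Range chart can be computed with a few lines of code. The sketch below uses the standard constants for moving ranges of two consecutive points (2.66 for the Individuals limits, 3.267 for the Moving Range upper limit). The list of durations is hypothetical, chosen so that one call stands out; it is not data from either call centre, and the helper names are assumptions.

```python
from statistics import mean

def imr_limits(values):
    """Control limits (LCL, centre line, UCL) for an I-MR chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        # 2.66 = 3 / d2, with d2 = 1.128 for ranges of two points.
        "I": (centre - 2.66 * mr_bar, centre, centre + 2.66 * mr_bar),
        # 3.267 = D4 for ranges of two points; the lower limit is zero.
        "MR": (0.0, mr_bar, 3.267 * mr_bar),
    }

def out_of_control(values):
    """Indices of individual values outside the I-chart limits."""
    lcl, _, ucl = imr_limits(values)["I"]
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

# Hypothetical durations (secs) for one operator, in time order.
durations = [25.1, 26.3, 24.8, 25.9, 26.1, 24.5, 25.7, 38.0, 25.2, 26.0]
print(imr_limits(durations))
print("points to investigate:", out_of_control(durations))
```

This only applies the 'point beyond the limits' rule; packages such as Minitab offer further tests for runs and trends.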
One can also plot a control chart showing times for call centre A and call centre B on the same chart . . .

The Individual Value plot (I chart) shows the difference between call centres for average call duration, and the Moving Range (MR chart) shows that the variability between call centres is similar - although there are disproportionately more points from call centre A above the upper control limit - indicating that the variation in call times for A is higher than that for B. (This was confirmed by the Test of Equal Variances at the beginning of the analysis.)
What is the 'sigma' value for call centre A?
The data is non-normal, so a capability study based on the normal distribution shouldn't be used.
(When the processes are improved so that inconsistencies between operators, between equipment, between times of day etc. are removed, the distribution of call duration is far more likely to be normally distributed, so that more accurate process capabilities can be established.)
We can count the proportion of times that we exceed the upper spec limit of 35 secs and convert this to a Sigma value:
For call centre A there are 0 defects to this spec out of a total sample of 1500.
This equates to a performance of 6 Sigma +. We cannot accurately compute Sigma with zero defects.
We can, however, compute an upper limit for the proportion defective:

i.e. a 95% upper confidence limit for the proportion defective in call centre A is 0.2%.
(There is no significance attached to the test of p=0.5 in this case study.) This enables us to put a lower 95% confidence limit on the Sigma value of around 4.5 Sigma.
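The zero-defect case can be worked by hand. With 0 defects in n calls, the one-sided 95% upper limit for the proportion defective p solves (1 - p)^n = 0.05. The sketch below does this and converts the result to a Sigma value, assuming the conventional 1.5 sigma shift between long-term and short-term performance. The function names are hypothetical. The Sigma value comes out at about 4.4, close to the approximate figure quoted above.

```python
from statistics import NormalDist

def sigma_level(p_defective, shift=1.5):
    """'Sigma' value: long-term z-value plus the conventional 1.5 shift."""
    return NormalDist().inv_cdf(1 - p_defective) + shift

def upper_limit_zero_defects(n, conf=0.95):
    """One-sided upper confidence limit for p when 0 defects occur in n trials."""
    # Solves (1 - p)^n = 1 - conf for p.
    return 1 - (1 - conf) ** (1 / n)

p_upper = upper_limit_zero_defects(1500)
sigma_lower = sigma_level(p_upper)
print(f"upper limit for proportion defective: {p_upper:.4%}")
print(f"lower limit for Sigma: {sigma_lower:.2f}")
```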
For call centre B there are 80 defects to this spec out of a total sample of 2000 (i.e. a defect proportion of 4%).
This corresponds to a long term sigma value (z-value) of 1.75, i.e. a 'Sigma' value of approx. 3.25 once the conventional 1.5 sigma shift is added.
Similarly, a 95% confidence limit for the proportion defective, given 80 out of a sample of 2000, is:

So, the true proportion defective in call centre B could be between 3.2 and 4.95% - i.e. Sigma between approx 3.1 and 3.4.
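An exact binomial interval of this kind can be reproduced without special software. The sketch below assumes the exact (Clopper-Pearson) method, which is a common default for this kind of interval but is not confirmed by the output above. It finds each limit by bisection on the binomial distribution function, then converts the limits to Sigma values using the 1.5 shift. All function names are hypothetical.

```python
from math import exp, lgamma, log
from statistics import NormalDist

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term built in log space."""
    total = 0.0
    for i in range(k + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1 - p))
        total += exp(log_pmf)
    return min(total, 1.0)

def solve_decreasing(f, target, lo=1e-12, hi=1 - 1e-12, iters=60):
    """Bisection for f(p) = target, where f decreases as p increases."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(k, n, conf=0.95):
    """Two-sided exact confidence interval for a binomial proportion."""
    alpha = 1 - conf
    lower = 0.0 if k == 0 else solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

def sigma_level(p_defective, shift=1.5):
    """'Sigma' value: long-term z-value plus the conventional 1.5 shift."""
    return NormalDist().inv_cdf(1 - p_defective) + shift

p_lo, p_hi = exact_ci(80, 2000)
print(f"proportion defective: {p_lo:.2%} to {p_hi:.2%}")
print(f"Sigma: {sigma_level(p_hi):.2f} to {sigma_level(p_lo):.2f}")
```

Note that the higher proportion defective maps to the lower Sigma value, so the limits swap order when converted.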
NB: Given the above analysis, there is a HUGE opportunity to improve the mean and variation of time per call by addressing the reasons for the differences highlighted above; viz. experience (training), type of equipment (modernise), time of day (make performance consistent) etc.
The statistics would suggest that call centre B can learn things from call centre A to bring its mean time down, and call centre A can learn things from call centre B to bring its variability down.
ONE CAN REPEAT ALL THE OTHER ANALYSES DONE ABOVE FOR CALL CENTRE B.
This won't be done here since the steps are identical.
CALL CENTRE CAPACITY :
HOW MANY CALLS CAN BE HANDLED PER DAY, WITH 95% CONFIDENCE ?
We can use the mean time per call and the standard deviation of call times to estimate this.
Let's work it out for some fictitious data, making certain assumptions:
Available time for an operator per day = 6 hours = 21600 secs.
Distribution of time per call is Normal, with a mean of 30 secs and a standard deviation of 3 secs.
(Other distributions can be used if necessary; worst case - a Uniform distribution can be used of width MAX VALUE - MIN VALUE, with a standard deviation of this width divided by the square root of 12.)
If we assume that all available time is taken handling calls, and each call has a duration as above, then a sequence of N calls in the available time will add up to a mean of N*30 seconds, with a standard deviation of SQRT(N*(3)^2) secs. (if call durations are independent of each other).
We require a 95% confidence that N calls can be handled in 1 day, if no special causes arise.
Hence we need to solve (approximating 1.96 by 2, and assuming N>>30):
30N + 2*SQRT(9N) = 21600
This arises from a linear addition of N means, where mean=30 secs, plus 1.96 (approximated by 2) aggregate standard deviations above this aggregate mean.
The aggregate standard deviation, assuming independence between samples, is simply the 'Root Sum Squares' of all the individual standard deviations.
So,
aggregate standard deviation = sqrt (N * 3^2) = sqrt (9N)
Simplifying the expression:
30N + 6*SQRT(N) = 21600
ie
5N + SQRT(N) = 3600
Using Excel to plug in various values for N, the solution is N=715 calls per day.
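Instead of trial and error, the equation can be solved directly: substituting x = sqrt(N) turns it into a quadratic in x. The sketch below does this for the general form mean*N + z*sd*sqrt(N) = available time. The function name `calls_at_confidence` is hypothetical. With the figures above the root is about 714.7, which rounds to the 715 quoted.

```python
from math import sqrt

def calls_at_confidence(available, mean, sd, z=2.0):
    """Solve mean*N + z*sd*sqrt(N) = available for N (as a real number)."""
    # With x = sqrt(N) the equation becomes mean*x^2 + z*sd*x - available = 0;
    # take the positive root of that quadratic and square it.
    x = (-z * sd + sqrt((z * sd) ** 2 + 4 * mean * available)) / (2 * mean)
    return x ** 2

n_calls = calls_at_confidence(21600, 30, 3)
print(f"N = {n_calls:.1f} calls per day")

# Check: plugging N back in should recover the available time.
print(30 * n_calls + 2 * sqrt(9 * n_calls))
```

Passing z=1.96 instead of the rounded value of 2 changes the answer only slightly.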
If each call took exactly 30 secs, then this number of calls represents 21450 secs, i.e. about 5 hours 57.5 mins of continuous activity; the allowance for uncertainty due to variation in call time, with a standard deviation of 3 secs, is therefore about 2.5 mins per day.
One can also use Crystal Ball to simulate the addition of any number of Normal (or Uniform etc) distributions to enable N to be calculated such that 95% of the resulting aggregate distribution lies to the left of 6 hours continuous work (or whatever the available time per day is).
The addition of 715 individual Normal distributions with means of 30 and standard deviations of 3 was simulated.
The resulting distribution had a mean of 21417 secs (5.95 hours), with a standard deviation of 79 secs, in accordance with the Root Sum Squares hand calculation method.

This is the result of a 2000 run simulation for 715 calls.
One would then move the right hand pointer (above the number 21,633.22 in the diagram) to the left to find the number of seconds which covers 95% of the distribution; this is the 95% upper confidence limit for the time taken to handle 715 calls per day.
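A similar simulation can be sketched without Crystal Ball. The code below is an illustration under the same assumptions (715 independent Normal call durations, mean 30 secs, standard deviation 3 secs, 2000 runs). The seed and variable names are arbitrary choices, and the run is not expected to reproduce the Crystal Ball figures exactly. For reference, the theoretical aggregate has a mean of 715 x 30 = 21450 secs and a standard deviation of sqrt(715 x 9), about 80 secs.

```python
import random
from statistics import mean, stdev

rng = random.Random(2003)   # arbitrary seed so the run is repeatable
N_CALLS, RUNS = 715, 2000

# Each run is the total time (secs) to handle N_CALLS consecutive calls.
totals = sorted(
    sum(rng.gauss(30, 3) for _ in range(N_CALLS)) for _ in range(RUNS)
)

sim_mean = mean(totals)
sim_sd = stdev(totals)
# Empirical 95th percentile: 95% of simulated days finish within this time.
p95 = totals[int(0.95 * RUNS) - 1]

print(f"mean = {sim_mean:.0f} secs, sd = {sim_sd:.1f} secs")
print(f"95% of runs finish within {p95:.0f} secs ({p95 / 3600:.2f} hours)")
```

Swapping `rng.gauss(30, 3)` for another generator, such as `rng.uniform(low, high)`, covers the non-Normal cases mentioned above.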
% of calls answered
Let's assume that out of a random sample of 1000 calls registered as coming in, 875 were answered and 125 were lost calls.
The target is 90%, with a tolerance of +/- 10%.
The 95% confidence limits for the true population proportion answered are given by formula or Minitab:

i.e. the true % answered will be between 85% and 90%, with over 95% confidence.
The p-value indicates that we are very confident (99%) that the proportion answered is NOT 'ON TARGET' - the upper 95% confidence limit is less than 90%.
It is 'in tolerance', however.
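The 'by formula' route can be sketched as follows. This is a normal-approximation version, so the limits may differ in the third decimal place from an exact binomial calculation in Minitab. The function name `proportion_summary` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_summary(successes, n, target, conf=0.95):
    """Normal-approximation CI for a proportion and a test against a target."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    # The test statistic uses the standard error under the null (p = target).
    z_stat = (p_hat - target) / sqrt(target * (1 - target) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return (p_hat - half, p_hat + half), z_stat, p_value

ci, z_stat, p_value = proportion_summary(875, 1000, target=0.90)
print(f"95% CI for proportion answered: {ci[0]:.1%} to {ci[1]:.1%}")
print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
```

A p-value below 0.01 corresponds to the 'over 99% confident' statement that the proportion answered is not on target.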
We can also answer questions like
'Given the current average % answered, how many samples would I need to take in order to be X% confident in seeing an improvement of Z% in the % answered?'
(We can answer similar questions for improvements in time to deal with calls.)
© onesixsigma.com 2003-2008. All rights reserved.










