Six Sigma Tolerance Design Case Study

By Andy Sleeper | Published: 01 Oct 06

Six Sigma Tolerance Design Case Study: Optimizing an Analog Circuit Using Monte Carlo Analysis.

Abstract: Tolerance Design is the science of predicting the variation in system performance caused by variations in component values or the environment. This article shows how Monte Carlo simulation can be applied to predict and improve the quality of a system before even one prototype has been built. Using these methods allows new products to be developed rapidly and introduced with fewer unexpected problems. The case study in this article is a simple analog circuit. The analytical methods and optimisation process may be successfully applied to any engineering problem where a transfer function can be derived.

Overview of Tolerance Design

In general, any product or process is a system converting inputs to outputs. This is shown graphically in Figure 1.

At the center of the system is a Transfer function, which converts the inputs (X) into outputs (Y). The transfer function is a mathematical equation, which may be known, estimated, or unknown. These three types of transfer functions are common in engineering problems:

  • White box transfer functions are derived analytically, using principles of science and engineering.
  • Gray box transfer functions are estimated by simulating the behavior of the system, using computer programs like SPICE. The function itself may be too complicated to derive, or it may have no closed-form solution.
  • Black box transfer functions are estimated by observing the behaviour of a physical system. This is done by designing an orthogonal experiment, collecting the data, and estimating the transfer function using analysis of variance and linear regression methods.

This paper describes an example of Tolerance Design applied to a white-box transfer function. Whenever possible, white-box transfer functions are preferred, because they can be derived earlier in the development process, leading to faster introduction of new products.

Figure 2 illustrates an effective process for tolerance design, using these five steps. More details on these steps will be explained later, using the case study as an example.

  • Define tolerance for Y: Based on customer requirements for the system, define the widest limits on Y which provide tolerable performance for the system.
  • Develop transfer function: Derive the transfer function for the initial design of the system. Set up an Excel worksheet with formulas to calculate the transfer function.
  • Compile variation data on X: If real data is available on the X’s, compute statistics from that data, and select distributions that represent the variation seen in the data. Usually, no data is available, and an assumption is needed. When nothing is known about X, assume that it is uniformly distributed between its tolerance limits. This is a conservative assumption, because it is worse than real life, in most cases. Using Crystal Ball software, define Assumption cells for each input X, based on this information.
  • Predict variation of Y: Using Crystal Ball software, define Forecast cells for each output Y. Set run preferences and run the simulation. In a Six Sigma environment, compute capability metrics for Y, such as CP, CPK and DPMLT. If these predicted quality metrics meet Six Sigma criteria, then, stop!
  • Optimise system: If the system is not acceptable, what needs to be changed? Consider these questions:

a) Does the tolerance for Y accurately reflect customer needs?
b) Which X contributes most to variation in Y? The sensitivity chart produced by Crystal Ball tells you this. For the biggest contributor, either get some data to replace the default assumption, or choose a different component with less variation. Don’t waste time fiddling with the small contributors on the sensitivity chart.
c) If the design needs to be changed, a new transfer function must be developed. The results of the simulation and the sensitivity chart provide clues to help in your redesign effort.
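The loop formed by steps 2 to 4 can be sketched outside a spreadsheet as well. The following is a minimal illustration in plain Python, not the Crystal Ball workflow itself. The transfer function (an unloaded resistor divider), the component values, and the tolerance limits on Y are all hypothetical and chosen only to make the example runnable.

```python
import random
import statistics

def divider_output(vin, r1, r2):
    """Hypothetical white-box transfer function: an unloaded resistor divider."""
    return vin * r2 / (r1 + r2)

def simulate(trials, seed=1):
    """Steps 3 and 4: sample each X uniformly within its tolerance, then collect Y."""
    rng = random.Random(seed)
    # name -> (nominal, fractional tolerance); illustrative values only
    inputs = {"vin": (10.0, 0.02), "r1": (10_000.0, 0.01), "r2": (10_000.0, 0.01)}
    ys = []
    for _ in range(trials):
        x = {name: rng.uniform(nom * (1 - tol), nom * (1 + tol))
             for name, (nom, tol) in inputs.items()}
        ys.append(divider_output(**x))
    return ys

def summarize(ys, lower, upper):
    """Mean, standard deviation, and fraction of trials inside the Y tolerance."""
    inside = sum(lower <= y <= upper for y in ys) / len(ys)
    return statistics.fmean(ys), statistics.stdev(ys), inside

ys = simulate(20_000)
mean, sd, inside = summarize(ys, lower=4.9, upper=5.1)  # hypothetical Y tolerance
print(f"mean={mean:.4f}  sd={sd:.4f}  within tolerance={inside:.2%}")
```

If the fraction inside the tolerance is too low, step 5 applies: change a tolerance or the design, then rerun the same loop.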

Case Study

The schematic shown below is part of a 5V power supply designed to detect when the 5V voltage drops too low. When this happens, the comparator changes state, resetting the processor before it starts doing evil things.

Step 1: Define Tolerance for Y

First, what is Y? What characteristics of this circuit are we interested in? Here are three:

  • VTRIP-DOWN: This is the voltage of the +5V bus when the comparator changes state while the +5V is going down, for instance, when the power supply is shutting off.
  • VTRIP-UP: This is the voltage of the +5V bus when the comparator changes state while the +5V is going up, for instance, when the power supply starts up.
  • VHYST = VTRIP-UP − VTRIP-DOWN: For stability, the comparator circuit requires a certain amount of hysteresis.

For simplicity in this article, we will only analyse VTRIP-DOWN. If you wish to practice using these techniques, try analysing the other two Ys as an exercise!

So what are the customer requirements for VTRIP-DOWN? This circuit is buried inside a product, and appears to be far away from the customer. No customer is ever aware of this circuit, unless it fails to work properly. This circuit is a safety device, intended to prevent undesired malfunction of the digital circuitry. So the customer requirement for VTRIP-DOWN is to shut down the processor before its supply voltage goes out of range at 4.75V. Therefore, 4.75V is the lower tolerance limit.

The upper tolerance limit is set by the variation of the +5V output itself. If VTRIP-DOWN is above 4.85V, and the +5V voltage is low because of load conditions or its inherent variation, the system will not work correctly.

So the tolerance limits for VTRIP-DOWN are 4.75V to 4.85V.

Step 2: Develop Transfer Function

For many problems, this step can be the most difficult. But a few simple guidelines help make this easier:

  • Do not include inputs which have negligible impact
  • Use new symbols to represent intermediate values
  • Keep equations short. Look for opportunities to substitute symbols for portions of the equation

For the undervoltage comparator, there are many inputs I choose to ignore. This is risky, and requires some engineering judgment. There is a risk of ignoring an input that is actually significant. So when in doubt, either leave it in, or use some other method (such as circuit simulation) to determine if the input is significant or not.

In this case, I choose to ignore the effect of the resistor in series with the reference diode. Based on the specifications of the diode, I can calculate that the effect of the resistor tolerance is in the nanovolt range, which is swamped out by the voltage tolerance of the diode. So I feel safe in ignoring this input. Likewise, the input bias current of the comparator and the load impedance of the circuit following the comparator have effects, but these are extremely small, and I ignore them. What follows is one way to derive the transfer function. In this derivation:

This last equation is the transfer function to be analysed. Figure 4 illustrates an Excel worksheet containing this formula. Here are some tips to make this process easier.


  • Enter a name in the cell to the left of each component. In the next step, Crystal Ball will automatically pick up this name for each Assumption cell.
  • Format each cell with a reasonable number of decimal places
  • Split the transfer function into small pieces to minimize errors. Here, the numerator and denominator of the fraction were calculated separately.

Step 3: Compile Variation Data on Each Input X

In the ideal world, engineers would have access to vast databases with actual measured values from samples of all these parts. From this data, we could select the most appropriate probability distribution and use that distribution for the Monte Carlo simulation. But in real life, most engineers have no data. For the first simulation in data-poor real life, I recommend assuming that each component is uniformly distributed between its specification limits. This is a conservative assumption, because it is usually (but not always) worse than real data will be. A handy way to implement this assumption with Crystal Ball is to define the tolerance limits in worksheet cells. For each X, define a uniform distribution and enter references to the cells where the tolerance limits are located. This is illustrated in Figure 5.

After defining the first assumption, use the Crystal Ball Copy Data and Paste Data functions to quickly define the rest of the assumption cells.
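To see why the uniform default is conservative, it helps to compare standard deviations. A uniform distribution across ± tol has a standard deviation of tol/√3 (about 0.58 tol), while a normal distribution whose ± 3 sigma points sit at the same limits has a standard deviation of tol/3. The sketch below keeps a small tolerance table, in the same spirit as the worksheet cells, and draws uniform samples from it. The component names and values are hypothetical.

```python
import math
import random
import statistics

# Hypothetical tolerance table, mirroring worksheet cells: name -> (nominal, +/- tolerance)
tolerances = {
    "R_a": (10_000.0, 100.0),   # ohms, a 1% part (illustrative)
    "V_ref": (2.5, 0.025),      # volts (illustrative)
}

def draw_uniform(rng, nominal, tol):
    """One trial value, uniformly distributed between the tolerance limits."""
    return rng.uniform(nominal - tol, nominal + tol)

rng = random.Random(42)
for name, (nominal, tol) in tolerances.items():
    samples = [draw_uniform(rng, nominal, tol) for _ in range(50_000)]
    sd_uniform = tol / math.sqrt(3)   # theoretical sd of the uniform assumption
    sd_normal = tol / 3               # sd of a normal with limits at +/- 3 sigma
    print(name, f"simulated sd={statistics.stdev(samples):.5g}",
          f"uniform sd={sd_uniform:.5g}", f"normal sd={sd_normal:.5g}")
```

The simulated standard deviation should land near the uniform value, which is roughly 1.7 times the normal value for the same limits.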

Step 4: Predict Variation of Y

Select the cell containing the calculated value for VTRIP-DOWN and define that as a Crystal Ball forecast cell, so that Crystal Ball will keep track of the randomly generated values. At this point, the spreadsheet looks like Figure 6.

Next, we must decide how many trials to run. We could pick a number out of the air, but Crystal Ball provides a better approach, called precision control. Using this feature, the simulation runs until we have enough information. In this case, I asked Crystal Ball to run until the mean and standard deviation of VTRIP-DOWN are known to within 1%, with 95% confidence. For this model, this precision was achieved after 15,500 trials, which were completed in 10 seconds on my computer. I also selected Latin Hypercube Sampling, which tends to converge faster than the default simple random sampling used by Crystal Ball. For more complicated models which require more calculation time, relaxing the precision control to 5% or more may be needed to finish the simulation in a practical time.
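The idea behind precision control can be sketched as a stopping rule: keep adding trials until the 95% confidence half-widths of the mean and of the standard deviation are both within 1% of their estimates. The code below is an illustration under stated assumptions, not Crystal Ball's actual algorithm. It uses simple random sampling rather than Latin Hypercube Sampling, a hypothetical toy model in place of the circuit, and the large-sample approximation SE(s) ≈ s·sqrt((k − 1)/(4n)), where k is the sample kurtosis.

```python
import math
import random
import statistics

def toy_model(rng):
    """Hypothetical forecast: a 5 V level plus three independent uniform error terms."""
    return 5.0 + sum(rng.uniform(-0.03, 0.03) for _ in range(3))

def run_with_precision_control(rel_precision=0.01, z=1.96, batch=500,
                               max_trials=200_000, seed=7):
    """Add trials in batches until mean and sd are both known to rel_precision."""
    rng = random.Random(seed)
    ys = []
    while len(ys) < max_trials:
        ys.extend(toy_model(rng) for _ in range(batch))
        n = len(ys)
        mean = statistics.fmean(ys)
        sd = statistics.pstdev(ys, mu=mean)
        # Sample kurtosis (a normal distribution gives 3)
        kurt = sum((y - mean) ** 4 for y in ys) / (n * sd ** 4)
        half_width_mean = z * sd / math.sqrt(n)
        half_width_sd = z * sd * math.sqrt((kurt - 1) / (4 * n))
        if (half_width_mean <= rel_precision * abs(mean)
                and half_width_sd <= rel_precision * sd):
            break
    return n, mean, sd

n_trials, mean_y, sd_y = run_with_precision_control()
print(f"stopped after {n_trials} trials: mean={mean_y:.4f}, sd={sd_y:.5f}")
```

In a model like this one the standard deviation is the binding constraint, because the mean is large relative to its spread and reaches 1% precision almost immediately.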

Figure 7 displays the frequency chart for the forecast VTRIP-DOWN. The certainty grabbers are set at the tolerance limits, 4.75 and 4.85. Clearly, this design has a problem. Based on this simulation, only 94.63% of these circuits would meet their tolerance requirements. In a Six Sigma environment, we must calculate other metrics, such as CP, CPK and DPMLT. To do this, we need the mean and standard deviation of VTRIP-DOWN, which Crystal Ball predicts are 4.8049 and 0.02568, respectively. I plug these values into another spreadsheet to make the capability calculations. (This worksheet, CapMet16.xls, is available on my web site, www.OQPD.com.)

This report predicts a CPK of 0.58 and a long term defect rate of 399,042 Defects Per Million Units (DPMLT). These metrics are clearly unacceptable. The shifted distributions in the chart illustrate the effects of inevitable shifts and drifts which happen during the production of a product.
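The formulas inside CapMet16.xls are not shown in this article, so the following is only a sketch of the conventional calculation. It assumes Y is normally distributed and that the long-term mean shifts by 1.5 standard deviations toward the nearer tolerance limit. The function name and its parameters are illustrative.

```python
from statistics import NormalDist

def capability(mean, sd, lower, upper, shift=1.5):
    """Cp, Cpk, and long-term DPM for a normal Y with a mean shift toward the nearer limit."""
    cp = (upper - lower) / (6 * sd)
    cpk = min(upper - mean, mean - lower) / (3 * sd)
    # Shift the mean by `shift` standard deviations toward whichever limit is closer
    direction = 1 if (upper - mean) <= (mean - lower) else -1
    shifted = NormalDist(mean + direction * shift * sd, sd)
    p_defect = shifted.cdf(lower) + (1 - shifted.cdf(upper))
    return cp, cpk, p_defect * 1e6

# Mean and standard deviation predicted by the simulation in the text
cp, cpk, dpm_lt = capability(mean=4.8049, sd=0.02568, lower=4.75, upper=4.85)
print(f"Cp={cp:.2f}  Cpk={cpk:.2f}  DPM_LT={dpm_lt:,.0f}")
```

With the simulation results quoted above, this lands close to the reported CPK and DPMLT; small differences are expected from rounding and from whatever refinements the worksheet applies.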

Step 5: Optimise System

Clearly, improvement is needed. We can revisit the tolerance for VTRIP-DOWN, but for the reasons explained above, no changes to the tolerance are possible. So what is causing most of the variation in this system? The Crystal Ball sensitivity chart, shown in Figure 9, has the answer.

The biggest contributor to variation is Voffset, followed closely by R1 and R2. So the first change to the system should be to improve Voffset.
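One common way to rank contributors from Monte Carlo output is to compute the rank correlation between each input and the output, square it, and normalize the squares so they sum to 100%. The sketch below does this in plain Python. It is an illustration of that general technique and is not claimed to reproduce Crystal Ball's chart exactly. The transfer function and input ranges are hypothetical, not the case-study values.

```python
import random

def ranks(values):
    """0-based rank of each value; ties are not expected for continuous samples."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result

def pearson(a, b):
    """Pearson correlation; applied to ranks it gives the Spearman rank correlation."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def contribution_to_variance(inputs, output):
    """Squared rank correlations, normalized to sum to 1."""
    out_ranks = ranks(output)
    squared = {name: pearson(ranks(xs), out_ranks) ** 2 for name, xs in inputs.items()}
    total = sum(squared.values())
    return {name: value / total for name, value in squared.items()}

rng = random.Random(3)
n = 5_000
# Hypothetical inputs, NOT the case-study values
inputs = {
    "v_offset": [rng.uniform(-0.015, 0.015) for _ in range(n)],
    "v_ref": [rng.uniform(2.49, 2.51) for _ in range(n)],
    "gain": [rng.uniform(1.99, 2.01) for _ in range(n)],
}
# Hypothetical transfer function: Y = (v_ref + v_offset) * gain
output = [(vr + vo) * g
          for vo, vr, g in zip(inputs["v_offset"], inputs["v_ref"], inputs["gain"])]
shares = contribution_to_variance(inputs, output)
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {share:.1%}")
```

The input at the top of the list is the one worth improving first; the ones at the bottom are not worth the effort.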

Revision 1: Better Comparator

For a modest increase in parts cost, the LM2903 comparator can be replaced with an LM293, which controls offset voltage to 0 ± 9 mV over temperature.

The organisation of the Excel worksheet used in this case study makes revision very convenient. By changing the tolerance in cell C5 to 0.009, the parameters of the Voffset assumption are automatically updated. After repeating the simulation with these settings, CPK is now 0.68 and DPMLT is now 290,947. It's better, but not good yet. The sensitivity chart in Figure 11 shows that R1 and R2 are now the big culprits. Further improvement to the comparator would not be cost-effective.

Revision 2: Using 0.1% Resistors for R1 and R2

It is possible (at high cost) to purchase 0.1% resistors. What if these were used in place of R1 and R2? It’s easy to find out. Change the values in cells C6 and C7 to 0.1% and repeat the simulation.

Figure 12 shows the predicted frequency chart with the tolerance limits set as the limits of the plot. None of the trials fell outside of tolerance limits. As a result, CP = 1.44, CPK = 1.31 and DPMLT = 7,829. These numbers are better, and out of all the simulated units, none failed. But there are still two big problems with this design:

First, the odd-value 0.1% resistors are expensive, and using them creates costly problems for procurement and inventory. If these are not part of the standard parts stocked for assembly, additional equipment and setup will be necessary.

Second, this quality level is still not good enough for Six Sigma. To meet Design For Six Sigma (DFSS) standards, CPK must be 2 or greater. After a product goes into production, shifts and drifts caused by components, processes and uncontrolled environmental factors may shift the average by 1.5 standard deviations or more, without being detected. A DFSS product must be designed so that quality is good even after the average values are shifted by 1.5 standard deviations.

What is good enough? For a normally distributed process, if CPK = 2.00, then the long term defect rate (DPMLT) is 3.4 Defects Per Million Units. That’s world-class quality for this type of product. So what can we do if the system is already too costly and still does not meet quality requirements? Redesign it.
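The 3.4 figure follows from the 1.5 standard deviation shift described above. With CPK = 2, the nearer tolerance limit is 6 standard deviations from the mean; after the shift it is only 4.5 away. Treating the far limit as negligible and writing Φ for the standard normal cumulative distribution function, the arithmetic is roughly:

```latex
\mathrm{DPM}_{LT} \approx 10^{6}\,\bigl[\,1 - \Phi(3\,C_{PK} - 1.5)\,\bigr]
                 = 10^{6}\,\bigl[\,1 - \Phi(4.5)\,\bigr]
                 \approx 10^{6} \times 3.4 \times 10^{-6}
                 = 3.4
```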

Take a look at the transfer function, shown again here:

I regrouped the equation to illustrate the impact of the ratio R1/R2 on the result. Can we control the ratio R1/R2 and reduce cost? Yes we can!

Revision 3: Network of Matched Resistors

There are resistor networks containing two resistors with a tightly controlled ratio. Because the resistors are manufactured on a single die, these parts are reasonably priced. One such part contains two 10,000 ohm resistors with 0.1% absolute tolerance, while the ratio is controlled to 1 ± 0.025%. This part is less expensive than even one 0.1% resistor. The drawing below shows a revision of the design, using this component.

Figure 13 Undervoltage Comparator, Revision 3

So far, the system models we have used assume that all components are independent of each other. Here, we have intentionally introduced a dependency between R1 and R2. How do we set up the Monte Carlo model so Crystal Ball will simulate this dependency?

If we had a number of samples of the resistor network, we could measure them, compute the correlation coefficient between R1 and R2, and specify this correlation in Crystal Ball. But if we have no samples and no data, we must make an assumption. A reasonable assumption is that the values of R1 and R2 are uniformly distributed within their tolerance zones.

Figure 14 - Tolerance zone of R1,R2

Figure 14 illustrates the tolerance zone of these two resistors. Each part must be within 0.1% (10 ohms) of the nominal value, and the ratio is controlled to within 0.025%. So if R1 = 10,000 ohms, the tolerance for R2 is 9,997.5 to 10,002.5 ohms. One way to express this to Crystal Ball is to use the following trick:

  • Specify R1 as 10,000 ± 0.1%.
  • In the transfer function, replace R2 by (R1 + R2A).
  • Specify R2A as 0 ± 2.5 ohms.

The new transfer function is shown below:

Simulating this transfer function leads to this frequency chart:

Figure 15 - Frequency chart - Revision 3
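The R1 + R2A substitution described above can be checked with a few lines of code. This sketch samples the pair with plain Python instead of Crystal Ball, assuming uniform distributions for both R1 and R2A. The function names are illustrative.

```python
import random

def sample_network(rng, nominal=10_000.0, abs_tol=0.001, ratio_tol=0.00025):
    """One (R1, R2) pair: R1 uniform within abs_tol, and R2 = R1 + R2A."""
    r1 = rng.uniform(nominal * (1 - abs_tol), nominal * (1 + abs_tol))
    r2a = rng.uniform(-nominal * ratio_tol, nominal * ratio_tol)  # 0 +/- 2.5 ohms
    return r1, r1 + r2a

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(11)
pairs = [sample_network(rng) for _ in range(20_000)]
r1s, r2s = zip(*pairs)
worst_ratio_error = max(abs(r2 / r1 - 1) for r1, r2 in pairs)
rho = correlation(r1s, r2s)
print(f"worst ratio error = {worst_ratio_error:.5%}, correlation = {rho:.3f}")
```

Two side effects of the trick are worth knowing. Because R2A is fixed in ohms rather than as a fraction of R1, the ratio error can slightly exceed 0.025% when R1 is at its low limit, and R2 can fall up to 2.5 ohms outside its own ± 10 ohm band. Both effects are small next to the other sources of variation.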

Now, CP = 1.45, CPK = 1.29 and DPMLT = 8,863, about the same as the previous revision. So cost has improved, but quality has not. To plan the next step, once again look at the sensitivity chart, shown in Figure 16.

Once again, VOFFSET is the biggest culprit, while all the resistors now have a trivial impact.

Revision 4: Define Assumption Based on Real Data

It is time to question the default assumption that each component is uniformly distributed between its tolerance limits. After all, the comparator comes from a company that publicly champions its Six Sigma program. It should be of high quality. So, a sample of 50 LM293 parts is drawn from stock, including samples from different date codes. The offset voltage is measured on all these parts. The figure below shows a histogram of this data.

The Crystal Ball Batch Fit tool may be used to select a distribution model which best fits this data. In this case, we decide to use a normal distribution, with parameters set based on the statistics of this sample: mean = 3 × 10^-6 and s = 0.00207. In the spreadsheet model of the transfer function, we change the assumption for VOFFSET to a normal distribution with the parameters listed above, and repeat the simulation.
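A quick calculation shows how much variation the default assumption was adding. The sketch below compares the standard deviation implied by a uniform distribution over the ± 9 mV specification with the measured sample standard deviation. It assumes both figures are expressed in volts, consistent with the 0.009 entered in the worksheet for 9 mV.

```python
import math

spec_half_width = 0.009   # LM293 offset limit from the text, assumed to be in volts
s_measured = 0.00207      # sample standard deviation from the text, assumed volts

# Standard deviation implied by the default uniform assumption over +/- 9 mV
sd_uniform = spec_half_width / math.sqrt(3)
ratio = sd_uniform / s_measured

print(f"uniform sd = {sd_uniform * 1e3:.2f} mV, "
      f"measured s = {s_measured * 1e3:.2f} mV, ratio = {ratio:.2f}")
```

The default assumption overstated the spread of the offset voltage by a factor of about 2.5, which is why replacing it with measured data has such a large effect on the prediction.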

The results are shown below:

Now, CP = 2.43, CPK = 2.15 and the predicted long-term defect rate is 0.3 DPM.

Figure 19 illustrates that even with a 1.5-sigma shift added to this process, quality levels are extremely good.

Summary

In this article, an analog circuit design is used to illustrate the power of Tolerance Design techniques and Monte Carlo simulation. The initial design proved to be unsatisfactory, and through a series of revisions, we generated a new design of extremely high quality at reasonable cost. Here are the steps we followed:

  • We analysed the initial design using Crystal Ball Monte Carlo simulation, assuming that each component is uniformly distributed between its tolerance limits. The results showed unacceptably high variation.
  • The sensitivity chart identified the biggest cause of variation, so we replaced it with a tighter tolerance part. This reduced variation, but not enough.
  • We tried 0.1% resistors, which further improved quality, but at unacceptable parts cost.
  • We recognized that the transfer function depends heavily on the ratio R1/R2. Instead of discrete 0.1% resistors, we used a resistor network with controlled ratio. This reduced parts cost to acceptable levels, but variation was still too high.
  • Again, the sensitivity chart identified the biggest cause of variation. We gathered a sample of parts and measured them, using actual data instead of the default assumption. This change brought the predicted quality to an acceptable level.

New product design is always iterative. To introduce products more quickly, these iterations must be done rapidly, in the analysis phase. Later, in the prototype phase, revisions are slow and costly. This case study illustrates how a design may be fully optimised before building a single prototype. Tolerance Design and Monte Carlo simulation are the keys to a safe, robust and successful new product.

Andy Sleeper
Successful Statistics LLC
970-420-0243
andy@OQPD.com


© onesixsigma.com 2003-2008. All rights reserved.