Category Archives: Design of Experiments

Profound Podcast: John Hunter – Curious Cat

John Willis interviewed me for his Profound podcast series (listen to part one of the podcast, John Hunter – Curious Cat)

This post provides links to more information on what we discussed in the podcast. Hopefully these links allow you to explore ideas that were mentioned in the podcast and that you would like to learn more about.

We also talked about six sigma a bit on the podcast. While I believe six sigma falls far short of what a good management system should encompass, I am less negative about it than most Deming folks are. I discussed my thoughts in: Deming and Six Sigma. In my opinion the biggest problems people complain about with six sigma efforts stem from how poorly it is implemented, which is true for every management system I have seen. I have also discussed the idea of poor implementation of management practices previously: Why Use Lean (or Deming or…) if So Many Fail To Do So Effectively.

I will add another blog post for part two of the interview when I get a chance.

Listen to more interviews with me.

Understanding Design of Experiments (DoE) in Protein Purification

This webcast, from GE Life Sciences, seeks to provide an understanding of Design of Experiments (DoE) using an example of protein purification. It begins with a good overview of why multi-factorial experiments, which change multiple factors at the same time, must be used in order to see interactions between factors. These interactions are completely missed by one-factor-at-a-time experiments.
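The point about interactions can be made concrete with a small sketch. The response model and numbers below are my own invented illustration (not from the webcast): a hypothetical two-factor process where the best settings are only visible when both factors are changed together.

```python
# Hypothetical illustration: one-factor-at-a-time (OFAT) misses an interaction
# that a full 2x2 factorial reveals. Factors are coded -1 / +1.
from itertools import product

def response(a, b):
    # Invented yield model: modest main effects plus a strong A*B interaction.
    return 50 + 5 * a + 5 * b + 10 * a * b

# OFAT: vary A with B held at -1, then vary B with A held at -1.
ofat = {(a, -1): response(a, -1) for a in (-1, 1)}
ofat.update({(-1, b): response(-1, b) for b in (-1, 1)})

# Full factorial: every combination, so the (+1, +1) corner gets tested.
factorial = {(a, b): response(a, b) for a, b in product((-1, 1), repeat=2)}

best_ofat = max(ofat, key=ofat.get)            # OFAT never tries (+1, +1)
best_factorial = max(factorial, key=factorial.get)
print(best_ofat, best_factorial)               # (-1, -1) (1, 1)
```

Here OFAT concludes the baseline (-1, -1) is best (yield 50), while the factorial finds the interaction-driven optimum at (+1, +1) with yield 70.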

While it is a good introduction it might be a bit confusing if you are not familiar with multi-factorial designed experiments. You may want to read some of the links below or take advantage of the ability to pause the video to think about what he says or to replay portions you don’t pick up immediately.

I have discussed the value of design of experiments in multiple posts on this blog in the past, including: Introductory Videos on Using Design of Experiments to Improve Results by Stu Hunter, Design of Experiments: The Process of Discovery is Iterative and Factorial Designed Experiment Aim.

He also provides a good overview of the 3 basic aims of multivariate experiments (DoE):

  • screening (to determine which factors have the largest impact on the results that are most important)
  • optimization (optimize the results)
  • robustness testing (determine if there are risks in variations to factors)

Normally an experiment will focus on one of these aims. If you don’t know the most important factors, you may choose to do a screening experiment to figure out which factors you want to study in detail in an optimization experiment.

It could be that an optimized set of values for the factors provides very good results but is not robust. If you don’t have an easy way to make sure the factors do not vary, it may be worthwhile to choose another option that provides nearly as good results but is much more robust (good results even with more variation in the values of the factors).
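As a rough illustration of the screening aim (my own sketch, not from the webcast), a small 2^3 full factorial lets you rank the main effects of three coded factors and decide which ones deserve a follow-up optimization experiment:

```python
# Invented screening example: run all 8 combinations of three coded factors
# and estimate each factor's main effect from the results.
from itertools import product

def run(a, b, c):
    # Hypothetical process: A dominates, B is moderate, C does nothing.
    return 100 + 8 * a + 3 * b + 0 * c

runs = {(a, b, c): run(a, b, c) for a, b, c in product((-1, 1), repeat=3)}

def main_effect(index):
    # Main effect = mean response at the +1 level minus mean at the -1 level.
    hi = [y for x, y in runs.items() if x[index] == 1]
    lo = [y for x, y in runs.items() if x[index] == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

effects = {name: main_effect(i) for i, name in enumerate("ABC")}
print(effects)  # {'A': 16.0, 'B': 6.0, 'C': 0.0}
```

Factor A stands out, so a follow-up optimization experiment would focus on A (and perhaps B) while dropping C.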

Related: YouTube Uses Multivariate Experiment To Improve Sign-ups 15% (2009) - Combinatorial Testing for Software (2009) - Marketers Are Embracing Statistical Design of Experiments (2005)

Design of Experiments: The Process of Discovery is Iterative

This video is another excerpt from the design of experiments videos by George Box; see previous posts: Introduction to Fractional Factorial Designed Experiments and The Art of Discovery. This video looks at learning about experimental design using paper helicopters (the paper linked there may be of interest to you also).

[the video is no longer available online]

In this example a screening experiment was done first to find those factors that have the largest impact on results. Once the most important factors are determined more care can be put into studying those factors in greater detail.

The video was posted by Wiley (with the permission of George’s family); Wiley is the publisher of George’s recent autobiography, An Accidental Statistician, and many of his other books.

The importance of keeping the scope (in dollars and time) of initial experiments down was emphasized in the video.

George Box: “Always remember the process of discovery is iterative. The results of each stage of investigation generating new questions to be answered during the next.”

Soren Bisgaard and Conrad Fung also appear in this excerpt of the video.

The end of the video includes several suggested resources including: Statistics for Experimenters, Out of the Crisis and The Scientific Context of Quality Improvement.

Related: Introductory Videos on Using Design of Experiments to Improve Results (with Stu Hunter) - Why Use Designed Factorial Experiments? - brainstorming - What Can You Find Out From 12 Experimental Runs?

George Box

I would most likely not exist if it were not for George Box. My father took a course from George while my father was a student at Princeton. George agreed to start the Statistics Department at the University of Wisconsin – Madison, and my father followed him to Madison to be the first PhD student. Dad graduated, and the next year became a professor there, where he and George remained for the rest of their careers.

George died today; he was born in 1919. He recently completed An Accidental Statistician: The Life and Memories of George E. P. Box, which is an excellent book that captures his great ability to tell stories. It is a wonderful read for anyone interested in statistics and management improvement, or just great stories of an interesting life.

[photo of George E. P. Box]

George Box by Brent Nicastro.

George Box was a fantastic statistician. I am not the person to judge, but from what I have read he was one of the handful of most important applied statisticians of the last 100 years. His contributions are enormous. Several well-known statistical methods bear his name, including:

George was elected a member of the American Academy of Arts and Sciences in 1974 and a Fellow of the Royal Society in 1979. He also served as president of the American Statistical Association in 1978. George was also an honorary member of ASQ.

George was a very kind, caring and fun person. He was a gifted storyteller and writer. He had the ability to present ideas so they were easy to comprehend and appreciate. While his writing was great, seeing him in person added so much more. Growing up I was able to enjoy his stories often, at our house or his. The last time I was in Madison, my brother and I visited with him and again listened to his marvelous stories about Karl Pearson, Ronald Fisher and so much more. He was one of those special people that made you very happy whenever you were near him.

George Box, Stuart Hunter and Bill Hunter (my father) wrote what has become a classic text for experimenters in scientific and business circles, Statistics for Experimenters. I am biased but I think this is acknowledged as one of (if not the) most important books on design of experiments.

George also wrote other classic books: Time Series Analysis: Forecasting and Control (1979, with Gwilym Jenkins) and Bayesian Inference in Statistical Analysis (1973, with George C. Tiao).

George Box and Bill Hunter co-founded the Center for Quality and Productivity Improvement at the University of Wisconsin-Madison in 1984. The Center develops, advances and communicates quality improvement methods and ideas.

The Box Medal for Outstanding Contributions to Industrial Statistics recognizes development and the application of statistical methods in European business and industry in his honor.

“All models are wrong but some are useful” is likely his most famous quote. More quotes by George Box.

A few selected articles and reports by George Box

Continue reading

Introductory Videos on Using Design of Experiments to Improve Results

The video shows Stu Hunter discussing design of experiments in 1966. It might be a bit slow going at first, but the full set of videos really does give you a quick overview of the many important aspects of design of experiments, including factorial designed experiments, fractional factorial design, blocking and response surface design. It really is quite good; if you find the start too slow, skip down to the second video and watch it.

My guess is, for those unfamiliar with even the most cursory understanding of design of experiments, the discussion may start moving faster than you can absorb the information. One of the great things about video is you can just pause and give yourself a chance to catch up or repeat a part that you didn’t quite understand. You can also take a look at articles on design of experiments.

I believe design of experiments is an extremely powerful methodology of improvement that is greatly underutilized. Six sigma is the only management improvement program that emphasizes factorial designed experiments.

Related: One factor at a time (OFAT) Versus Factorial Designs - The Purpose of Factorial Designed Experiments

Continue reading

One factor at a time (OFAT) Versus Factorial Designs

Guest post by Bradley Jones

Almost a hundred years ago R. A. Fisher‘s boss published an article espousing OFAT (one factor at a time). Fisher responded with an article of his own laying out his justification for factorial design. I admire the courage it took to contradict his boss in print!

Fisher’s argument was mainly about efficiency – that you could learn as much about many factors as you learned about one in the same number of trials. Saving money and effort is a powerful and positive motivator.

The most common argument I read against OFAT these days has to do with inability to detect interactions and the possibility of finding suboptimal factor settings at the end of the investigation. I admit to using these arguments myself in print.

I don’t think these arguments are as effective as Fisher’s original argument.

To play the devil’s advocate for a moment, consider this thought experiment. You have to climb a hill that runs on a line going from southwest to northeast, but you are only allowed to make steps that are due north or south or due east or west. Though you will have to make many zigzags, you will eventually make it to the top. If you noted your altitude at each step, you would have enough data to fit a response surface.

Obviously this approach is very inefficient but it is not impossible. Don’t mistake my intent here. I am definitely not an advocate of OFAT. Rather I would like to find more convincing arguments to persuade experimenters to move to multi-factor design.
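The thought experiment above can be sketched in a few lines. The hill function below is my own invented example; it only illustrates how axis-aligned, one-factor-at-a-time steps zig-zag their way up a diagonal ridge:

```python
# Invented illustration of the OFAT hill climb: the summit lies on the
# x == y diagonal, but we may only step due north/south/east/west.
def altitude(x, y):
    # Ridge running southwest-to-northeast, peaking at (5, 5).
    return -((x - y) ** 2) - (x + y - 10) ** 2

x = y = 0
path = [(x, y)]
while True:
    # Try a unit step along each axis by itself, one factor at a time.
    steps = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    best = max(steps, key=lambda p: altitude(*p))
    if altitude(*best) <= altitude(x, y):
        break  # no single-axis step improves: we have reached the top
    x, y = best
    path.append(best)

print(len(path), (x, y))  # 11 points visited, ending at the summit (5, 5)
```

The climb works, but it takes a long zig-zag path of single-axis moves, which is exactly the inefficiency (rather than impossibility) Bradley describes.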

Related: The Purpose of Factorial Designed Experiments - Using Design of Experiments - articles by R.A. Fisher - articles on using factorial design of experiments - Does good experimental design require changing only one factor at a time (OFAT)? - Statistics for Experimenters

Factorial Designed Experiment Aim

Multivariate experiments are a very powerful management tool to learn and improve performance. Experiments in general, and designed factorial experiments in particular, are dramatically underused by managers. A question on LinkedIn asks:

When doing a DOE we select factors with levels to induce purposely changes in the response variable. Do we want the response variable to move within the specs of the customers? Or it doesn’t matter since we are learning about the process?

The aim needs to consider what you are trying to learn, costs and potential rewards. Weighing the various factors will determine if you want to aim to keep results within specification, or if you can try options that are likely to return results that are outside of specs.

If the effort was looking for breakthrough improvement and costs of running experiments that might produce results outside of spec were low then specs wouldn’t matter much. If the costs of running experiments are very high (compared with expectations of results) then you may well want to try designed experiment values that you anticipate will still produce results within specs.

There are various ways costs come into play. Here I am mainly looking at net costs (costs minus revenue). For example, if the results are within spec and can be used, the net costs (including revenue) of the experiment run are substantially lower.
Continue reading

Highlights from Recent George Box Speech

The JMP blog has posted some highlights from George Box’s presentation at Discovery 2009 [the broken link was removed]

Infusing his entire presentation with humor and fascinating tales of his memories, Box focused on sequential design of experiments. He attributed much of what he knows about DOE [design of experiments] to Ronald A. Fisher. Box explained that Fisher couldn’t find the things he was looking for in his data, “and he was right. Even if he had had the fastest available computer, he’d still be right,” said Box. Therefore, Fisher figured out how to study a number of factors at one time. And so, the beginnings of DOE.

Having worked and studied with many other famous statisticians and analytic thinkers, Box did not hesitate to share his characterizations of them. He told a story about Dr. Bill Hunter and how he required his students to run an experiment. Apparently a variety of subjects was studied [see 101 Ways to Design an Experiment, or Some Ideas About Teaching Design of Experiments]

According to Box, the difficulty of getting DOE to take root lies in the fact that these mathematicians “can’t really get the fact that it’s not about proving a theorem, it’s about being curious about things. There aren’t enough people who will apply [DOE] as a way of finding things out. But maybe with JMP, things will change that way.”

George Box is a great mind and great person who I have had the privilege of knowing my whole life. My father took his class at Princeton, then followed George to the University of Wisconsin-Madison (where Dr. Box founded the statistics department and Dad received the first PhD). They worked together building the UW statistics department, writing Statistics for Experimenters and founding the Center for Quality and Productivity Improvement among many other things.

Statistics for Experimenters: Design, Innovation, and Discovery shows that the goal of design of experiments is to learn and refine your experiment based on the knowledge you gain and experiment again. It is a process of discovery. If done properly it is very similar to the PDSA cycle with the application of statistical tools to aid in determining the impact of various factors under study.

Related: Box on Quality - George Box Quotations - posts on design of experiments - Using Design of Experiments

YouTube Uses Multivariate Experiment To Improve Sign-ups 15%

Google does a great job of using statistical and engineering principles to improve. It is amazing how slow we are to adopt new ideas, but because we are, it provides big advantages to companies like Google that use concepts like design of experiments and experimenting quickly and often while others don’t. Look Inside a 1,024 Recipe Multivariate Experiment:

A few weeks ago, we ran one of the largest multivariate experiments ever: a 1,024 recipe experiment on 100% of our US-English homepage. Utilizing Google Website Optimizer, we made small changes to three sections on our homepage (see below), with the goal of increasing the number of people who signed up for an account. The results were impressive: the new page performed 15.7% better than the original, resulting in thousands more sign-ups and personalized views to the homepage every day.

While we could have hypothesized which elements result in greater conversions (for example, the color red is more eye-catching), multivariate testing reveals and proves the combinatorial impact of different configurations. Running tests like this also help guide our design process: instead of relying on our own ideas and intuition, you have a big part in steering us in the right direction. In fact, we plan on incorporating many of these elements in future evolutions of our homepage.
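To get a feel for the scale, the 1,024 recipes are just the cross product of the variants of each page section. The per-section counts below (8 × 8 × 16) are my own guess purely for illustration; the Google post only says three sections were varied and the total was 1,024:

```python
# Illustration only: three page sections, each with an assumed number of
# variants, multiplying out to the 1,024 recipes Google reported.
from itertools import product

sections = {
    "headline": [f"h{i}" for i in range(8)],       # assumed 8 variants
    "screenshot": [f"s{i}" for i in range(8)],     # assumed 8 variants
    "signup_button": [f"b{i}" for i in range(16)],  # assumed 16 variants
}

recipes = list(product(*sections.values()))
print(len(recipes))  # 1024 distinct page variants to split traffic across
```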

via: @hexawise

My brother has created a software application to provide much better test coverage with far fewer tests, using the same factorial designed experiments ideas my father worked with decades ago (and which still far too few people use).

Related: Combinatorial Testing for Software - Statistics for Experimenters - Google’s Website Optimizer allows for multivariate testing of your website - Using Design of Experiments

Combinatorial Testing for Software

Combinatorial testing of software is very similar to the design of experiments work my father was involved in, and which I have a special interest in. Combinatorial testing looks at binary interaction effects (success or failure), since it is seeking to find bugs in software, while design of experiments captures the magnitude of interaction effects on performance. In the last several years my brother, Justin Hunter, has been working on using combinatorial testing to improve software development practices. He visited me this week and we discussed the potential value of increasing the adoption of combinatorial testing, which is similar to the value of increasing the adoption of the use of design of experiments: both offer great opportunities for large improvements in current practices.

Automated Combinatorial Testing for Software

Software developers frequently encounter failures that occur only as the result of an interaction between two components. Testers often use pairwise testing – all pairs of parameter values – to detect such interactions. Combinatorial testing beyond pairwise is rarely used because good algorithms for higher strength combinations (e.g., 4-way or more) have not been available, but empirical evidence shows that some errors are triggered only by the interaction of three, four, or more parameters

Practical Combinatorial Testing: Beyond Pairwise by Rick Kuhn, US National Institute of Standards and Technology; Yu Lei, University of Texas, Arlington; and Raghu Kacker, US National Institute of Standards and Technology.

the detection rate increased rapidly with interaction strength. Within the NASA database application, for example, 67 percent of the failures were triggered by only a single parameter value, 93 percent by two-way combinations, and 98 percent by three-way combinations. The detection-rate curves for the other applications studied are similar, reaching 100 percent detection with four- to six-way interactions.
These results are not conclusive, but they suggest that the degree of interaction involved in faults is relatively low, even though pairwise testing is insufficient. Testing all four- to six-way combinations might therefore provide reasonably high assurance.
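The pairwise idea can be sketched with a naive greedy generator. The four factors and their values below are invented for illustration, and real tools (such as the NIST work or Hexawise) use far better algorithms than this:

```python
# Toy greedy pairwise (2-way) test suite generator over invented factors.
# Goal: cover every pair of (factor, value) combinations with few test rows.
from itertools import combinations, product

factors = [["on", "off"], ["ipv4", "ipv6"], ["tcp", "udp"], ["gz", "raw"]]

def pairs(test):
    # All (factor-index, value) pairs covered by one complete test row.
    return {((i, test[i]), (j, test[j]))
            for i, j in combinations(range(len(test)), 2)}

# Every pair that must be covered at least once.
uncovered = set().union(*(pairs(t) for t in product(*factors)))

suite = []
while uncovered:
    # Greedily pick the candidate row covering the most uncovered pairs.
    best = max(product(*factors), key=lambda t: len(pairs(t) & uncovered))
    suite.append(best)
    uncovered -= pairs(best)

print(len(suite))  # far fewer rows than the 16 exhaustive combinations
```

Even this crude greedy pass covers all two-way interactions with a handful of tests instead of the full 16-row cross product, which is the core economy of combinatorial testing.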

Related: Future Directions for Agile Management - The Defect Black Market - Metrics and Software Development - Full and Fractional Factorial Test Design - Google Website Optimizer