Posts about Design of Experiments


One factor at a time (OFAT) Versus Factorial Designs

Guest post by Bradley Jones

Almost a hundred years ago R. A. Fisher’s boss published an article espousing OFAT (one factor at a time). Fisher responded with an article of his own laying out his justification for factorial design. I admire the courage it took to contradict his boss in print!

Fisher’s argument was mainly about efficiency – that you could learn as much about many factors as you learned about one in the same number of trials. Saving money and effort is a powerful and positive motivator.
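Fisher's efficiency point can be illustrated with a small sketch. The response function below is hypothetical and noise-free, chosen only to make the arithmetic visible; the point is that in a two-level full factorial every main effect is estimated from all of the runs, not just a subset of them.

```python
from itertools import product


def full_factorial(k):
    """All 2**k runs of a two-level design, coded -1 (low) and +1 (high)."""
    return [list(run) for run in product((-1, 1), repeat=k)]


def main_effect(design, responses, factor):
    """Average response at the high level minus average at the low level."""
    high = [y for run, y in zip(design, responses) if run[factor] == 1]
    low = [y for run, y in zip(design, responses) if run[factor] == -1]
    return sum(high) / len(high) - sum(low) / len(low)


def response(a, b, c):
    """Hypothetical process, assumed for illustration only."""
    return 50 + 4 * a - 3 * b + 1 * c


design = full_factorial(3)                      # 8 runs for 3 factors
ys = [response(*run) for run in design]
effects = [main_effect(design, ys, f) for f in range(3)]
print(len(design), effects)
```

Each of the three effects here compares four runs against four runs, using all eight. A one-factor-at-a-time plan that wanted the same four-versus-four comparison for every factor would need four runs at a baseline plus four more for each changed factor, or sixteen runs in total.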

The most common argument I read against OFAT these days has to do with inability to detect interactions and the possibility of finding suboptimal factor settings at the end of the investigation. I admit to using these arguments myself in print.

I don’t think these arguments are as effective as Fisher’s original argument.

To play the devil’s advocate for a moment consider this thought experiment. You have to climb a hill that runs on a line going from southwest to northeast but you are only allowed to make steps that are due north or south or due east or west. Though you will have to make many zig zags you will eventually make it to the top. If you noted your altitude at each step, you would have enough data to fit a response surface.
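The thought experiment can be sketched in a few lines. The altitude function, the starting point and the unit step length are all assumptions made for illustration: a ridge running southwest to northeast with its top at (5, 5), and a climber who may only step due north, south, east or west.

```python
def altitude(x, y):
    """Hypothetical ridge running southwest to northeast, peak at (5, 5)."""
    return 100 - 0.5 * (x - y) ** 2 - 0.25 * (x + y - 10) ** 2


# Allowed moves: due north, south, east, west (one unit each).
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def climb(start):
    """Take the best axis-aligned step until no single step gains altitude."""
    path = [start]
    while True:
        x, y = path[-1]
        best = max(((x + dx, y + dy) for dx, dy in MOVES),
                   key=lambda p: altitude(*p))
        if altitude(*best) <= altitude(x, y):
            return path
        path.append(best)


path = climb((0, 0))
heights = [altitude(x, y) for x, y in path]     # the altitude log at each step
print(path)
```

The climber zig-zags north, east, north, east and takes ten steps to cover what is a straight diagonal line. The surface here is deliberately gentle: with a fixed step length and a much sharper ridge, a climber of this kind can stall partway up because no single axis step gains altitude, which is one more way the approach is inefficient.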

Obviously this approach is very inefficient but it is not impossible. Don’t mistake my intent here. I am definitely not an advocate of OFAT. Rather I would like to find more convincing arguments to persuade experimenters to move to multi-factor design.

Related: The Purpose of Factorial Designed Experiments; Using Design of Experiments; articles by R.A. Fisher; articles on using factorial design of experiments; Does good experimental design require changing only one factor at a time (OFAT)?; Statistics for Experimenters

Factorial Designed Experiment Aim

Multivariate experiments are a very powerful management tool to learn and improve performance. Experiments in general, and designed factorial experiments in particular, are dramatically underused by managers. A question on LinkedIn asks:

When doing a DOE we select factors with levels to induce purposely changes in the response variable. Do we want the response variable to move within the specs of the customers? Or it doesn’t matter since we are learning about the process?

The aim needs to consider what you are trying to learn, costs and potential rewards. Weighing the various factors will determine if you want to aim to keep results within specification or can try options that are likely to return results that are outside of specs.

If the effort was looking for breakthrough improvement and costs of running experiments that might produce results outside of spec were low then specs wouldn’t matter much. If the costs of running experiments are very high (compared with expectations of results) then you may well want to try designed experiment values that you anticipate will still produce results within specs.

There are various ways costs come into play. Here I am mainly looking at net costs (costs – revenue). For example, if the results are within spec and the output can be used, the net cost of the experiment run is substantially lower.

Combinatorial Testing – The Quadrant of Massive Efficiency Gains

My brother, Justin Hunter, gives a lightning talk on Combinatorial Testing – The Quadrant of Doom and The Quadrant of Massive Efficiency Gains in the video above. The following text is largely directly quoted from the talk – with a bit of editing by me.

When you have a situation that has many, many, many possible parameters, each with only a few possible choices (a few items you are trying to vary and test – in his example in the video, 2 choices), you wind up with a ridiculous number of possible tests. But you can cover all the possibilities in just 30 tests if your coverage target is all possible pairs. When you have situations like that you will see dramatic efficiency gains. What we have found in real world tests is greatly reduced time to create the tests and consistently 2 to 3 times as many defects found compared to the standard methods used for software testing.
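A minimal sketch of the all-pairs idea, assuming a simple greedy strategy and eight parameters with two choices each. This is not the algorithm Hexawise uses, and the function names are made up for illustration. It enumerates every candidate test, so it only works for small examples, but it shows how few tests are needed once the target is every pair of values rather than every full combination.

```python
from itertools import combinations, product


def all_pairs(n_params, levels=(0, 1)):
    """Every (param_i, value_i, param_j, value_j) pairing that must be covered."""
    return {
        (i, a, j, b)
        for i, j in combinations(range(n_params), 2)
        for a, b in product(levels, repeat=2)
    }


def pairs_in(test):
    """Pairings covered by one test (a tuple of parameter values)."""
    return {(i, test[i], j, test[j])
            for i, j in combinations(range(len(test)), 2)}


def greedy_pairwise(n_params, levels=(0, 1)):
    """Repeatedly pick the test that covers the most still-uncovered pairings."""
    uncovered = all_pairs(n_params, levels)
    candidates = list(product(levels, repeat=n_params))
    suite = []
    while uncovered:
        best = max(candidates, key=lambda t: len(pairs_in(t) & uncovered))
        suite.append(best)
        uncovered -= pairs_in(best)
    return suite


suite = greedy_pairwise(8)
print(len(suite), "tests instead of", 2 ** 8)
```

The resulting suite covers every pair of parameter values with a small fraction of the 256 exhaustive tests. A greedy pick is not guaranteed to give the smallest possible suite.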

You can read more on these ideas on his blog, where he explores software testing and combinatorial testing. The web-based software testing application my brother created and shows in the demo is Hexawise. It is free to try out. I recommend it, though I am biased.

Related: Combinatorial Testing for Software; Video Highlight Reel of Hexawise – a pairwise testing tool and combinatorial testing tool; YouTube Uses Multivariate Experiment To Improve Sign-ups 15%; What Else Can Software Development and Testing Learn from Manufacturing? Don’t Forget Design of Experiments (DoE)

Justin posted the presentation slides online for anyone who is interested in seeing more details about the test plan he reviewed that had 1,746,756,896,558,880,852,541,440 possible tests. The slides are well worth reading.

Highlights from Recent George Box Speech

The JMP blog has posted some highlights from George Box’s presentation at Discovery 2009:

Infusing his entire presentation with humor and fascinating tales of his memories, Box focused on sequential design of experiments. He attributed much of what he knows about DOE [design of experiments] to Ronald A. Fisher. Box explained that Fisher couldn’t find the things he was looking for in his data, “and he was right. Even if he had had the fastest available computer, he’d still be right,” said Box. Therefore, Fisher figured out how to study a number of factors at one time. And so, the beginnings of DOE.

Having worked and studied with many other famous statisticians and analytic thinkers, Box did not hesitate to share his characterizations of them. He told a story about Dr. Bill Hunter and how he required his students to run an experiment. Apparently a variety of subjects was studied [see 101 Ways to Design an Experiment, or Some Ideas About Teaching Design of Experiments]

According to Box, the difficulty of getting DOE to take root lies in the fact that these mathematicians “can’t really get the fact that it’s not about proving a theorem, it’s about being curious about things. There aren’t enough people who will apply [DOE] as a way of finding things out. But maybe with JMP, things will change that way.”

George Box is a great mind and great person who I have had the privilege of knowing my whole life. My father took his class at Princeton, then followed George to the University of Wisconsin-Madison (where Dr. Box founded the statistics department and Dad received the first PhD). They worked together building the UW statistics department, writing Statistics for Experimenters and founding the Center for Quality and Productivity Improvement among many other things.

Statistics for Experimenters: Design, Innovation, and Discovery shows that the goal of design of experiments is to learn and refine your experiment based on the knowledge you gain and experiment again. It is a process of discovery. If done properly it is very similar to the PDSA cycle with the application of statistical tools to aid in determining the impact of various factors under study.

Related: Box on Quality; George Box Quotations; posts on design of experiments; Using Design of Experiments

YouTube Uses Multivariate Experiment To Improve Sign-ups 15%

Google does a great job of using statistical and engineering principles to improve. It is amazing how slow we are to adopt new ideas, but that slowness gives a big advantage to companies like Google that use concepts like design of experiments, experimenting quickly and often, while others don’t. From Look Inside a 1,024 Recipe Multivariate Experiment:

A few weeks ago, we ran one of the largest multivariate experiments ever: a 1,024 recipe experiment on 100% of our US-English homepage. Utilizing Google Website Optimizer, we made small changes to three sections on our homepage (see below), with the goal of increasing the number of people who signed up for an account. The results were impressive: the new page performed 15.7% better than the original, resulting in thousands more sign-ups and personalized views to the homepage every day.

While we could have hypothesized which elements result in greater conversions (for example, the color red is more eye-catching), multivariate testing reveals and proves the combinatorial impact of different configurations. Running tests like this also help guide our design process: instead of relying on our own ideas and intuition, you have a big part in steering us in the right direction. In fact, we plan on incorporating many of these elements in future evolutions of our homepage.

via: @hexawise

My brother has created a software application to provide much better test coverage with far fewer tests, using the same factorial designed experiments ideas my father worked with decades ago (and which still far too few people use).

Related: Combinatorial Testing for Software; Statistics for Experimenters; Google’s Website Optimizer allows for multivariate testing of your website; Using Design of Experiments

Combinatorial Testing for Software

Combinatorial testing of software is very similar to the design of experiments work my father was involved in, and which I have a special interest in. Combinatorial testing looks at binary interaction effects (success or failure), since it is seeking to find bugs in software, while design of experiments captures the magnitude of interaction effects on performance. In the last several years my brother, Justin Hunter, has been working on using combinatorial testing to improve software development practices. He visited me this week and we discussed the potential value of increasing the adoption of combinatorial testing, which is similar to the value of increasing the adoption of the use of design of experiments: both offer great opportunities for large improvements in current practices.

Automated Combinatorial Testing for Software

Software developers frequently encounter failures that occur only as the result of an interaction between two components. Testers often use pairwise testing – all pairs of parameter values – to detect such interactions. Combinatorial testing beyond pairwise is rarely used because good algorithms for higher strength combinations (e.g., 4-way or more) have not been available, but empirical evidence shows that some errors are triggered only by the interaction of three, four, or more parameters.

Practical Combinatorial Testing: Beyond Pairwise by Rick Kuhn, US National Institute of Standards and Technology; Yu Lei, University of Texas, Arlington; and Raghu Kacker, US National Institute of Standards and Technology.

the detection rate increased rapidly with interaction strength. Within the NASA database application, for example, 67 percent of the failures were triggered by only a single parameter value, 93 percent by two-way combinations, and 98 percent by three-way combinations. The detection-rate curves for the other applications studied are similar, reaching 100 percent detection with four- to six-way interactions.
These results are not conclusive, but they suggest that the degree of interaction involved in faults is relatively low, even though pairwise testing is insufficient. Testing all four- to six-way combinations might therefore provide reasonably high assurance.

Related: Future Directions for Agile Management; The Defect Black Market; Metrics and Software Development; Full and Fractional Factorial Test Design; Google Website Optimizer

Statistics for Experimenters in Spanish


Statistics for Experimenters, second edition, by George E. P. Box, J. Stuart Hunter and William G. Hunter (my father) is now available in Spanish.

You can read a bit more on the Spanish edition, in Spanish: Estadística para Investigadores: Diseño, innovación y descubrimiento, Segunda edición.

Statistics for Experimenters – Second Edition:

Catalyzing innovation, problem solving, and discovery, the Second Edition provides experimenters with the scientific and statistical tools needed to maximize the knowledge gained from research data, illustrating how these tools may best be utilized during all stages of the investigative process. The authors’ practical approach starts with a problem that needs to be solved and then examines the appropriate statistical methods of design and analysis.

* Graphical Analysis of Variance
* Computer Analysis of Complex Designs
* Simplification by transformation
* Hands-on experimentation using Response Surface Methods
* Further development of robust product and process design using split plot arrangements and minimization of error transmission
* Introduction to Process Control, Forecasting and Time Series

Book available via Editorial Reverte

Related: Statistics for Experimenters Review; Correlation is Not Causation; Statistics for Experimenters Data; posts on design of experiments

Full and Fractional Factorial Test Design

An Essential Primer on Full and Fractional Factorial Test Design

Since full factorial gathers additional data, it reveals all possible interactions, but as seen by the numbers above, there is a trade-off. More data equals more information but more data also equals a longer test duration. The minimum data requirements for full factorial are very high since you are showing every experiment.

Even if you are using full factorial to get the same amount of information as a fractional factorial test, it will take more time since you need more data to see statistically relevant differences between the many experiments. You might be wondering how fractional factorial can be accurate if interactions are possible?

Random interactions of high relevance are very rare, especially when looking for interactions of more than 2 factors. You really need to design tests where you look for meaningful interactions that are based on true business requirements rather than hoping for a random and low influence interaction between a red button, a hero shot and a headline.
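The trade-off described in the excerpt can be made concrete with a small sketch. It assumes four two-level factors (A, B, C, D) and the common half-fraction generator D = ABC; neither the factor count nor the generator comes from the primer itself. The half fraction needs 8 runs instead of 16, and the price is that some interactions can no longer be told apart.

```python
from itertools import product


def half_fraction():
    """2^(4-1) design: full factorial in A, B, C with D set to A*B*C."""
    return [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]


def dot(u, v):
    """Sum of element-wise products; zero means the two columns are orthogonal."""
    return sum(x * y for x, y in zip(u, v))


runs = half_fraction()                  # 8 runs instead of 2**4 = 16
cols = list(zip(*runs))                 # columns for A, B, C, D

# Main-effect columns are mutually orthogonal, so each main effect
# can still be estimated separately from the others.
main_orthogonal = all(
    dot(cols[i], cols[j]) == 0 for i in range(4) for j in range(i + 1, 4)
)

# The AB and CD interaction columns are identical: the two are aliased.
ab = [a * b for a, b, c, d in runs]
cd = [c * d for a, b, c, d in runs]
print(len(runs), main_orthogonal, ab == cd)
```

If two-factor interactions such as AB and CD both matter, this particular fraction cannot separate them. That is why the choice of fraction should be driven by which interactions are plausible.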

I am a fan of design of experiments as long time readers know (see posts on design of experiments).

Some good resources for more on the topics discussed above: What Can You Find Out From 8 and 16 Experimental Runs? by George Box; Statistics for Experimenters; Design of Experiments in Advertising.

Related: Google Website Optimizer; factorial experiment articles; Using Design of Experiments; Marketers Are Embracing Statistical Design of Experiments

Printer Product Development Using Design of Experiments

MEMS development in less than half the time by Christopher N. Delametter, Eastman Kodak Company

The traditional approach to optimizing a product or process using computer simulation is to evaluate the effects of one design parameter at a time. The problem with this approach is that interactions between design factors and second-order effects are likely to result in a locally optimized design that will provide far less performance than the global optimum. Kodak researchers use DOE to develop tests that examine first-order, second-order, and multiple factor effects simultaneously with relatively few simulation runs. The result is that the analyst can iterate to a globally optimized design with a far higher level of certainty and in much less time than the traditional approach.

By using DOE to drive CFD, Kodak researchers were able to optimize the design of the printhead in considerably less time than competitors. The advantages of simulation were especially apparent late in the project when researchers discovered a more optimal ink formulation for one of the colors.

Related: Design of Experiments articles; Using Design of Experiments; Statistics for Experimenters; Why Use Designed Factorial Experiments?; Kodak Debuts Printers With Inexpensive Cartridges

Management Improvement Carnival #34

Please submit your favorite management posts to the carnival. Read the previous management carnivals.

  • Introduction to Factorial Designs by Jonathan Mendez – “I like the idea of velocity in marketing — test, learn, test, learn, test. Instead of one large test I prefer focusing attention on certain areas or elements to achieve deeper understanding.”
  • MIT’s Message about Lean Enterprise Transformation by Mark Edmondson – “1. Market leaders are good at embracing enterprise change; 2. Enterprise change requires a holistic approach that engages all stakeholders. This includes employees, suppliers, customers, unions, and investors/owners”
  • Two Types of Bottleneck by David J. Anderson – “I now teach that there are two types of bottleneck: capacity constrained resources CCRs; and non-instant availability resources”
  • Oranges, Pebbles, and Sand by Ron Pereira – “In this video my daughters and I demonstrate how meeting an objective is just the beginning to improvement.”
  • Why errorproof when you can double-check? – “If you are in the position to prevent the error in the first place, why wouldn’t you? And, I’d argue, if you can write a tool to detect the screw up – ie, it is possible to programmatically figure out that the template is wrong,”
  • Systems and Improvement by John Dowd – “Thus did Deming, over sixty years ago, show a basic model about how to think about quality and improvement.”

Box on Quality

Bill Hunter and George Box

Dr. George Box is not as well known in the general management community as his ideas merit (in my biased opinion – photo of Bill Hunter and George Box). He is well known in the statistics field as one of the leading statistical minds. Box on Quality is an excellent book that gathers his essays from his 65th to 80th year. The book has just been issued in paperback (which helps as the hardback was pricey).

While some of the essays are aimed at a reader with an advanced understanding of statistics, many of the articles are aimed at any manager attempting to apply Quality Management principles (SPC, Deming, process improvement, six sigma, etc.). An excerpt from the book provides a table of contents and an introduction.

Some of the articles from the book are available online. I encourage you to take a look at several of the articles and then go ahead and add this book to your prized management resources, if you find them worthwhile.

Marketers Are Embracing Statistical Design of Experiments

Marketers Are Embracing Statistical Design of Experiments (site broke link so I removed it) by Richard Burnham.

Crayola® conducts an e-mail marketing DOE to attract parents and teachers to their new Internet site. The company discovers a combination of factors that makes their new e-mail pitch three-and-a-half times more effective than the control. (Harvard Business Review, October 2001, “Boost Your Marketing ROI with Experimental Design,” Almquist, Wyner.)

Marketers can’t always be certain what triggers buyers to respond. In the past, we were always admonished to test-test-test, but only one factor at a time – relying on our gut feelings and uncertain hopes. With DOE, marketers have replaced voodoo with the science of statistics.

For more on Design of Experiments see:

Statistics for Experimenters – Second Edition


The classic Statistics for Experimenters has been updated by George Box and Stu Hunter, two of the three original authors. Bill Hunter, the third author and my father, died in 1986. Order online: Statistics for Experimenters: Design, Innovation, and Discovery, 2nd Edition by George E. P. Box, J. Stuart Hunter, William G. Hunter.
I happen to agree with those who call this book a classic; however, I am obviously biased.

Google Scholar citations for the first edition of Statistics for Experimenters.
Citations in Cite Seer to the first edition.

The first edition includes the text of Experiment by Cole Porter. In 1978 finding a recording of this song was next to impossible. Now Experiment can be heard on the De-Lovely soundtrack.

Text from the publisher on the 2nd Edition:
Rewritten and updated, this new edition of Statistics for Experimenters adopts the same approaches as the landmark First Edition by teaching with examples, readily understood graphics, and the appropriate use of computers. Catalyzing innovation, problem solving, and discovery, the Second Edition provides experimenters with the scientific and statistical tools needed to maximize the knowledge gained from research data, illustrating how these tools may best be utilized during all stages of the investigative process. The authors’ practical approach starts with a problem that needs to be solved and then examines the appropriate statistical methods of design and analysis.

Management Improvement History

Originally posted to the Deming Electronic Network, 22 Sep 1999, in response to this message

I would like to say that I think it is good that we have disagreements on the DEN. I think it is a strength of the DEN, not a weakness. However, I think we sometimes get too personal with no real purpose. One example of this, for me, is: “Well, I guess we knew different Demings. Mine was a teacher named Dr. W. Edwards Deming.” I doubt this statement is meant to be taken literally, and if it is not I do not see what it adds to the discussion. I point this out not because I think this is some bad act that should be punished but because I think we need to continue to develop a sense of how we wish to express our disagreements, and I think that we should try to do so more constructively.

“For the past 60 years we’ve been looking for the magic bullet that will improve the quality of our products, services and lives. In the 1940s, we applied statistics through sampling, SPC and design of experiments to improve our products. In the 1950s, we used quality cost and total quality control to bring about quality improvement. In the 1960s, zero defects and MIL-Q-9858A drove the quality improvement process. In the 1970s, quality circles, process qualification and supplier qualification became key quality issues. In the 1980s, employee training in problem solving, team activities and just-in-time inventory were the things to do.”

I find this statement so far from the truth that it would seriously damage any PDSA with this as an accepted assessment of history. I do not believe Deming had such an inaccurate view (of course I may be wrong). I do believe we need to improve our practice of Quality (and to do that we need to understand what happened in the past and why it was not more successful). The idea that Design of Experiments (DoE) was at the core of some Quality Movement to me is not at all accurate.