Posts about Science

One factor at a time (OFAT) Versus Factorial Designs

Guest post by Bradley Jones

Almost a hundred years ago R. A. Fisher’s boss published an article espousing OFAT (one factor at a time). Fisher responded with an article of his own laying out his justification for factorial design. I admire the courage it took to contradict his boss in print!

Fisher’s argument was mainly about efficiency – that you could learn as much about many factors as you learned about one in the same number of trials. Saving money and effort is a powerful and positive motivator.
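Fisher’s efficiency point can be illustrated with a small simulation. This is a minimal sketch, not taken from Fisher’s article: the three-factor process, its effect sizes, the baseline of 50 and the noise level are all made-up assumptions. It shows that in an 8-run two-level factorial every main effect is estimated from all 8 runs (4 high versus 4 low), while an OFAT plan with the same 4-versus-4 comparison per factor would need 16 runs.

```python
import itertools
import random
import statistics


def full_factorial(k):
    """All 2**k runs of a two-level design, coded as -1 (low) and +1 (high)."""
    return list(itertools.product([-1, 1], repeat=k))


def main_effect(runs, responses, factor):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for x, y in zip(runs, responses) if x[factor] == 1]
    low = [y for x, y in zip(runs, responses) if x[factor] == -1]
    return statistics.mean(high) - statistics.mean(low)


# HYPOTHETICAL true main effects for three factors (illustrative values only).
TRUE_EFFECTS = (4.0, -2.0, 1.0)


def simulate(run, rng, noise_sd=1.0):
    """Made-up additive process: baseline 50 plus half of each effect per coded unit."""
    signal = sum(0.5 * effect * level for effect, level in zip(TRUE_EFFECTS, run))
    return 50.0 + signal + rng.gauss(0.0, noise_sd)


rng = random.Random(0)
runs = full_factorial(3)
responses = [simulate(run, rng) for run in runs]

# Every factor's estimate uses all 8 runs: 4 at the high level, 4 at the low level.
estimates = [main_effect(runs, responses, f) for f in range(3)]
runs_per_level = [sum(1 for run in runs if run[f] == 1) for f in range(3)]

# OFAT with the same 4-vs-4 precision: 4 baseline runs plus 4 runs per factor.
ofat_runs = 4 + 4 * 3

print("factorial runs:", len(runs), "OFAT runs for equal precision:", ofat_runs)
print("estimated main effects:", [round(e, 2) for e in estimates])
```

The factorial runs do triple duty here, which is the sense in which you learn as much about many factors as about one in the same number of trials.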

The most common argument I read against OFAT these days has to do with inability to detect interactions and the possibility of finding suboptimal factor settings at the end of the investigation. I admit to using these arguments myself in print.

I don’t think these arguments are as effective as Fisher’s original argument.

To play the devil’s advocate for a moment, consider this thought experiment. You have to climb a hill that runs on a line going from southwest to northeast, but you are only allowed to take steps that are due north, south, east or west. Though you will have to make many zigzags, you will eventually make it to the top. If you noted your altitude at each step, you would have enough data to fit a response surface.

Obviously this approach is very inefficient but it is not impossible. Don’t mistake my intent here. I am definitely not an advocate of OFAT. Rather I would like to find more convincing arguments to persuade experimenters to move to multi-factor design.
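The thought experiment can be sketched in a few lines. This is only an illustration under stated assumptions: the `altitude` function is a made-up smooth hill whose ridge runs southwest to northeast with its peak at (10, 10), and the rule of halving the step when no compass move improves is my own addition (a simple compass search), not something from the post.

```python
def altitude(x, y):
    """HYPOTHETICAL hill: ridge along the SW-NE diagonal, peak of 100 at (10, 10)."""
    u, v = x - 10.0, y - 10.0
    return 100.0 - (u * u + v * v - 1.5 * u * v)


# Allowed moves: due north, south, east, west (one coordinate changes at a time).
MOVES = ((0.0, 1.0), (0.0, -1.0), (1.0, 0.0), (-1.0, 0.0))


def compass_climb(start, step=1.0, min_step=1e-3):
    """Take the best axis-aligned step; halve the step size when none improves."""
    x, y = start
    best = altitude(x, y)
    path = [(x, y, best)]  # the altitude is noted at every accepted step
    while step >= min_step:
        candidates = [
            (altitude(x + dx * step, y + dy * step), x + dx * step, y + dy * step)
            for dx, dy in MOVES
        ]
        top, new_x, new_y = max(candidates)
        if top > best:
            x, y, best = new_x, new_y, top
            path.append((x, y, best))
        else:
            step /= 2.0
    return path


path = compass_climb((0.0, 0.0))
final_x, final_y, final_altitude = path[-1]
print(f"accepted steps: {len(path) - 1}, "
      f"final position: ({final_x:.3f}, {final_y:.3f}), altitude: {final_altitude:.4f}")
```

The recorded `path` holds a position and altitude for every step, which is the kind of data the paragraph above says could be used to fit a response surface. The climb reaches the top, but only by zigzagging across the ridge.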

Related: The Purpose of Factorial Designed Experiments | Using Design of Experiments | articles by R.A. Fisher | articles on using factorial design of experiments | Does good experimental design require changing only one factor at a time (OFAT)? | Statistics for Experimenters

Problems With Student Evaluations as Measures of Teacher Performance

Dr. Deming was, among other things, a professor. He found the evaluation of professors by students an unimportant (and often counterproductive) measure, though it is used in some places for awards and performance appraisal. He said that for such a measure to be useful it should survey students 20 years later to see which professors made a difference to the students. Here is an interesting paper that explored some of these ideas: Does Professor Quality Matter? Evidence from Random Assignment of Students to Professors by Scott E. Carrell, University of California, Davis and National Bureau of Economic Research; and James E. West, U.S. Air Force Academy:

our results indicate that professors who excel at promoting contemporaneous student achievement, on average, harm the subsequent performance of their students in more advanced classes. Academic rank, teaching experience, and terminal degree status of professors are negatively correlated with contemporaneous value‐added but positively correlated with follow‐on course value‐added. Hence, students of less experienced instructors who do not possess a doctorate perform significantly better in the contemporaneous course but perform worse in the follow‐on related curriculum.

Student evaluations are positively correlated with contemporaneous professor value‐added and negatively correlated with follow‐on student achievement. That is, students appear to reward higher grades in the introductory course but punish professors who increase deep learning (introductory course professor value‐added in follow‐on courses). Since many U.S. colleges and universities use student evaluations as a measurement of teaching quality for academic promotion and tenure decisions, this latter finding draws into question the value and accuracy of this practice.

These findings have broad implications for how students should be assessed and teacher quality measured.

Related: Applying Lean Tools to University Courses | K-12 Educational Reform | Improving Education with Deming’s Ideas | Learning, Systems and Improvement | How We Know What We Know

Statistical Engineering Links Statistical Thinking, Methods and Tools

In Closing the Gap Roger W. Hoerl and Ronald D. Snee lay out a sensible case for focusing on statistical engineering.

We’re not suggesting that society no longer needs research in new statistical techniques for improvement; it does. The balance needed at this time, however, is perhaps 80% for statistics as an engineering discipline and 20% for statistics as a pure science.

True, though I would put the balance more like 95% engineering, 5% science.

There is a good discussion on LinkedIn:

Davis Balestracci: Unfortunately, we snubbed our noses at the Six Sigma movement…and got our lunch eaten. Ron Snee has been developing this message for the last 20 years (I developed it in four years’ worth of monthly columns for Quality Digest from 2005-2008). BUT…as long as people have a computer, color printer, and a package that does trend lines, academic arguments won’t “convert” anybody.

Recently, we’ve lost our way and evolved into developing “better jackhammers to drive tacks”…and pining for the “good ol’ days” when people listened to us (which they were forced to do because they didn’t have computers, and statistical packages were clunky). Folks, we’d better watch it…or we’re moribund

Were there really good old days when business listened to statisticians? Of course occasionally they did, but “good old days”? Here is a report from 1986 whose theme seems to me to be, basically, how to get statisticians listened to by the people who make the important decisions: The Next 25 Years in Statistics, by Bill Hunter and William Hill. Maybe I do the report a disservice with my understanding of the basic message, but it seems to me to be about how to make sure the important contributions of applied statisticians actually get applied in organizations. It also discusses how statisticians need to take action to drive adoption of the ideas because, at the time (1986), they were too marginalized (not listened to when they should be contributing) in most organizations.

Extrinsic Incentives Kill Creativity

If you read this blog, you know I believe extrinsic motivation is a poor strategy. In this TED webcast Dan Pink discusses studies showing extrinsic rewards failing. This is a great webcast, definitely worth 20 minutes of your time.

  • “you’ve got an incentive designed to sharpen thinking and accelerate creativity and it does just the opposite. It dulls thinking and blocks creativity… This has been replicated over and over and over again for nearly 40 years. These contingent motivators, if you do this then you get that, work in some circumstances but in a lot of tasks they actually either don’t work or, often, they do harm.”
  • there is a mismatch between what science knows and what business does
  • “This is a fact.”

What does Dan Pink recommend based on the research? Management should focus on providing workplaces where people have autonomy, mastery and purpose to build on intrinsic motivation.

via: Everything You Think about Pay for Performance Could Be Wrong

Related: Righter Incentivization | What’s the Value of a Big Bonus? | Dangers of Extrinsic Motivation | Motivate or Eliminate De-Motivation | Great Marissa Mayer Webcast on Google Innovation

YouTube Uses Multivariate Experiment To Improve Sign-ups 15%

Google does a great job of using statistical and engineering principles to improve. It is amazing how slow we are to adopt new ideas, but that slowness provides big advantages to companies like Google that use concepts like design of experiments and experiment quickly and often while others don’t. From Look Inside a 1,024 Recipe Multivariate Experiment:

A few weeks ago, we ran one of the largest multivariate experiments ever: a 1,024 recipe experiment on 100% of our US-English homepage. Utilizing Google Website Optimizer, we made small changes to three sections on our homepage (see below), with the goal of increasing the number of people who signed up for an account. The results were impressive: the new page performed 15.7% better than the original, resulting in thousands more sign-ups and personalized views to the homepage every day.

While we could have hypothesized which elements result in greater conversions (for example, the color red is more eye-catching), multivariate testing reveals and proves the combinatorial impact of different configurations. Running tests like this also help guide our design process: instead of relying on our own ideas and intuition, you have a big part in steering us in the right direction. In fact, we plan on incorporating many of these elements in future evolutions of our homepage.
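To make the recipe count concrete: in a full multivariate test, the number of recipes is the product of the number of variants tried in each section. The sketch below is illustrative only. The post gives the total (1,024) and the number of sections (three), but not how many variants each section had, so the 8 x 8 x 16 split and the section names are hypothetical.

```python
import itertools
import math

# HYPOTHETICAL variant counts per homepage section; the source gives only the
# total of 1,024 recipes across three sections, not this breakdown.
variants_per_section = {"section_a": 8, "section_b": 8, "section_c": 16}

# A full factorial (multivariate) test tries every combination of variants.
recipe_count = math.prod(variants_per_section.values())
recipes = list(
    itertools.product(*(range(n) for n in variants_per_section.values()))
)

# Changing one section at a time from a single baseline page covers far fewer:
# the baseline plus each non-baseline variant of each section.
ofat_count = 1 + sum(n - 1 for n in variants_per_section.values())

print("multivariate recipes:", recipe_count)
print("one-section-at-a-time pages:", ofat_count)
```

Under these assumed counts, the one-at-a-time approach would never show most combinations, so it could not reveal how a change in one section behaves alongside changes in the others.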

via: @hexawise

My brother has created a software application to provide much better test coverage with far fewer tests, using the same factorial designed experiments ideas my father worked with decades ago (and which far too few people use even now).

Related: Combinatorial Testing for Software | Statistics for Experimenters | Google’s Website Optimizer allows for multivariate testing of your website. | Using Design of Experiments

Does the Data Deluge Make the Scientific Method Obsolete?

The End of Theory: The Data Deluge Makes the Scientific Method Obsolete by Chris Anderson

“All models are wrong, but some are useful.”

So proclaimed statistician George Box 30 years ago, and he was right. But what choice did we have? Only models, from cosmological equations to theories of human behavior, seemed to be able to consistently, if imperfectly, explain the world around us. Until now. Today companies like Google, which have grown up in an era of massively abundant data, don’t have to settle for wrong models. Indeed, they don’t have to settle for models at all.

Speaking at the O’Reilly Emerging Technology Conference this past March, Peter Norvig, Google’s research director, offered an update to George Box’s maxim: “All models are wrong, and increasingly you can succeed without them.”

There is now a better way. Petabytes allow us to say: “Correlation is enough.” We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.

See update, below: Norvig was misquoted; he agrees with Box’s maxim.

I must say I am not at all convinced that a new method without theory is ready to supplant the existing scientific method. I can’t find Peter Norvig’s exact words online (come on Google – organize all the world’s information for me please). If he said that using massive stores of data to make discoveries in new ways is radically changing how we can learn and create useful systems, that I believe. I do enjoy the idea of trying radical new ways of viewing what is possible.

Practice Makes Perfect: How Billions of Examples Lead to Better Models (summary of his talk on the conference web site):

In this talk we will see that a computer might not learn in the same way that a person does, but it can use massive amounts of data to perform selected tasks very well. We will see that a computer can correct spelling mistakes, translate from Arabic to English, and recognize celebrity faces about as well as an average human—and can do it all by learning from examples rather than by relying on programming.

Related: Will the Data Deluge Makes the Scientific Method Obsolete? | Pragmatism and Management Knowledge | Data Based Decision Making at Google | Seeing Patterns Where None Exists | Manage what you can’t measure | Data Based Blathering | Understanding Data | Webcast on Google Innovation

Fairness Matters

Sense of Fairness Affects Outlook, Decisions

Burnout has been long associated with being overworked and underpaid, but psychologists Christina Maslach and Michael Leiter found that these were not the crucial factors. The single biggest difference between employees who suffered burnout and those who did not was whether they thought that they were being treated unfairly or fairly.

Their research on fairness dovetails with work by other researchers showing that humans care a great deal about how they are being treated relative to others. In many ways, fairness seems to matter more than absolute measures of how well they are faring — people seem willing to endure tough times if they have the sense the burden is being shared equally, but they quickly become resentful if they feel they are being singled out for poor treatment.

If the sum is $100, for example, the first person might offer to give away $25 and keep $75 for himself. If the second person agrees, the money is divided accordingly. But if the second person rejects the deal, neither one gets anything.

If people cared only about absolute rewards, then Person B ought to accept whatever Person A offers, because getting even $1 is better than nothing. But experiments show that many people will reject the deal if they feel the first person is dividing the money unfairly.
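The rules of the game described in the quote are simple enough to write down. This is a minimal sketch: the function name and the idea of modeling the second person with a fixed "minimum acceptable share" are my own assumptions for illustration, and the 30% threshold used below is hypothetical, not a figure from the research.

```python
def ultimatum_payoffs(total, offer, min_acceptable_share):
    """Return (first person, second person) payoffs for one round.

    The second person accepts any positive offer that is at least
    min_acceptable_share of the total; otherwise both get nothing.
    The threshold model is an illustrative assumption.
    """
    if not 0 <= offer <= total:
        raise ValueError("offer must be between 0 and the total")
    accepted = offer > 0 and offer >= min_acceptable_share * total
    if accepted:
        return total - offer, offer
    return 0, 0


# Someone who cares only about absolute rewards accepts any positive offer.
print(ultimatum_payoffs(100, 25, min_acceptable_share=0.0))
print(ultimatum_payoffs(100, 1, min_acceptable_share=0.0))

# HYPOTHETICAL fairness-minded responder who rejects anything under 30%.
print(ultimatum_payoffs(100, 25, min_acceptable_share=0.3))
```

With a threshold of zero the second person always walks away with something; with a fairness threshold above the offer, both people end up with nothing, which is the behavior the experiments report.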

Related: Obscene CEO Pay | Respect for People and Understanding Psychology | Why Pay Taxes or be Honest | The Illusion of Understanding | The Psychology of Too Much Choice

Drug Price Crisis

In 2005 I posted about some of the problems with drug pricing. It is nice to find at least a couple of people at MIT who want MIT to focus research on the public good instead of private profit. As I have mentioned, too many universities now act like they are for-profit drug or research companies. That is wrong. Drug companies can do so; institutions with purported higher purposes should not be driven to place advancing science below profiting the institution.

Solving the drug price crisis

The mounting U.S. drug price crisis can be contained and eventually reversed by separating drug discovery from drug marketing and by establishing a non-profit company to oversee funding for new medicines, according to two MIT experts on the pharmaceutical industry.

Following the utility model, Finkelstein and Temin propose establishing an independent, public, non-profit Drug Development Corporation (DDC), which would act as an intermediary between the two new industry segments — just as the electric grid acts as an intermediary between energy generators and distributors.

The DDC also would serve as a mechanism for prioritizing drugs for development, noted Finkelstein. “It is a two-level program in which scientists and other experts would recommend to decision-makers which kinds of drugs to fund the most. This would insulate development decisions from the political winds,” he said.

I see their idea as one worth trying. Let’s see how it works. Their book: Reasonable Rx – Solving the Drug Price Crisis by Stan Finkelstein and Peter Temin

Related: USA Spent $2.1 Trillion on Health Care in 2006 | Measuring the Health of Nations | Antibiotics Too Often Prescribed for Sinus Woes | $600 Million for Basic Biomedical Research | articles on improving the health care system

Statistics for Experimenters – Second Edition

Buy Statistics for Experimenters

The classic Statistics for Experimenters has been updated by George Box and Stu Hunter, two of the three original authors. Bill Hunter, my father and the third author, died in 1986. Order online: Statistics for Experimenters: Design, Innovation, and Discovery, 2nd Edition by George E. P. Box, J. Stuart Hunter, William G. Hunter.
I happen to agree with those who call this book a classic; however, I am obviously biased.

Google Scholar citations for the first edition of Statistics for Experimenters.
Citations in CiteSeer to the first edition.

The first edition includes the text of Experiment by Cole Porter. In 1978 finding a recording of this song was next to impossible. Now Experiment can be heard on the De-Lovely soundtrack.

Text from the publisher on the 2nd Edition:
Rewritten and updated, this new edition of Statistics for Experimenters adopts the same approaches as the landmark First Edition by teaching with examples, readily understood graphics, and the appropriate use of computers. Catalyzing innovation, problem solving, and discovery, the Second Edition provides experimenters with the scientific and statistical tools needed to maximize the knowledge gained from research data, illustrating how these tools may best be utilized during all stages of the investigative process. The authors’ practical approach starts with a problem that needs to be solved and then examines the appropriate statistical methods of design and analysis.
