## Understanding Data

Posted on May 9, 2006

Topic: Management Improvement

Statistics Abuse and Me by Jay Mathews:

How can that be? Is it important? First, yes, it is important. Effective use of data is an important part of management improvement. The emphasis is on effective, not on data. Use of data by itself is not sufficient.

To be effective, you need to learn to think not about what is printed on the page but about what lies behind the numbers you see. The numbers are just proxies for the real situation. Look beyond the numbers you see to what they mean and understand how the numbers presented may not fully capture the important details you need to consider.

OK, back to how the SAT numbers could seem to go up fairly significantly for all the subgroups of the total but only a little bit for the total. This numerical quirk is known as Simpson’s Paradox. If the proportions of the subgroups (Asians, American Indians…) change, then the overall average is affected not just by the changes in each subgroup's average SAT score but also by the changes in the weighting of each subgroup. So if the overall average gain is less than even the lowest gain for any subgroup, you know that the weighting must have increased for one or more of the subgroups with lower SAT scores.
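The weighting effect described above is easier to see with numbers. Here is a minimal sketch in Python. The two subgroups, their scores, and their sizes are hypothetical values made up purely for illustration; they are not the SAT figures from the article. Both subgroups gain 10 points, but the lower-scoring subgroup grows from 20% to 25% of the total, so the overall average rises by only 5.

```python
def weighted_mean(groups):
    """Overall average from {name: (subgroup mean, subgroup size)} pairs."""
    total = sum(mean * n for mean, n in groups.values())
    count = sum(n for _, n in groups.values())
    return total / count

# Hypothetical data, chosen only to illustrate the effect.
# Each entry is (average score, number of test takers).
year1 = {"A": (550, 800), "B": (450, 200)}
year2 = {"A": (560, 750), "B": (460, 250)}  # B's share grows from 20% to 25%

# Gain in the average score of each subgroup.
subgroup_gains = {name: year2[name][0] - year1[name][0] for name in year1}

# Gain in the overall (size-weighted) average.
overall_gain = weighted_mean(year2) - weighted_mean(year1)

print(subgroup_gains)  # {'A': 10, 'B': 10}
print(overall_gain)    # 5.0
```

The overall gain is smaller than the gain of every subgroup only because the mix of test takers shifted toward the lower-scoring subgroup, which is the situation the paragraph above describes.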

Take care when you make decisions based on your understanding of data to avoid assumptions that may not be correct.

- Measurement and Data Collection
- Data Based Decision Making
- Operational Definitions and Data Collection

9 Responses to “Understanding Data”

July 4th, 2006 @ 6:57 pm

[…] We commented on one example of why it is important to be careful in making conclusions based on data recently (in our management improvement blog). Most often people look for the differences to highlight the differences. That creates a bias to find such differences, which leads me to be a bit skeptical of such claims without an explanation of why the data is convincing that such a difference is significant and not just variation in the data. […]

September 20th, 2007 @ 7:35 pm

Google grew 39.8% year over year and Yahoo grew 8.9% year over year. Google now has 53.6% of the total searches…

October 30th, 2007 @ 7:44 pm

When people are asked to explain random variations in data they will make up special causes (that they often even believe are special causes even when they are not)…

July 13th, 2008 @ 6:50 pm

It would seem to me, in fact, actually having a 500 year flood actually increases the odds for it happening again (because the data now includes that case which had not been included before)…

August 29th, 2008 @ 4:13 pm

We have tendencies that lead us to draw faulty conclusions from data. Given that, it is important to understand what common mistakes are made to help us counter the natural tendencies…

March 6th, 2009 @ 10:00 am

“I used to think correlation implied causation.” Now…

March 14th, 2010 @ 10:56 am

You can also view 100 mpg as 1/100 gallon per mile, and other ratings as 2/100 gallons per mile, 5.6/100 gpm and 6.7/100 gpm. That way most everyone sees that going from 6.7 to 5.6 gallons per 100 miles saves more fuel than going from 2 to 1 does…

May 14th, 2010 @ 10:15 am

[…] Using Data Effectively Requires Thought – The numbers are just proxies for the real situation. Look beyond the numbers you see to what they mean and understand how the numbers presented may not fully capture the important details you need to consider. […]

August 1st, 2013 @ 9:37 am

[…] Understanding data is important in order to practice evidence based management. Every system has variation. When you reward, or blame, people based on how the variation falls when they are around, that is likely not a particularly helpful practice. […]