Understanding Data

Posted on May 9, 2006  Comments (9)

Topic: Management Improvement

Statistics Abuse and Me by Jay Mathews:

the Simpson’s Paradox numbers. The national average for the SAT went up only 4 points between 1981 and 2005, but the average for whites went up 10 points, for blacks 21 points, for Asians 37 points, for Mexicans 15 points, for Puerto Ricans 23 points and for American Indians 18 points.

How can that be? Is it important? First, yes it is important. Effective use of data is an important part of management improvement. Emphasize the effective, not the data. Use of data by itself is not sufficient.

To be effective you need to learn to think not about what is printed on the page but about what lies behind the numbers you see. The numbers are just proxies for the real situation. Look beyond the numbers to what they mean, and understand how the numbers presented may not fully capture the important details you need to consider.

OK, back to how the SAT numbers could go up fairly significantly for every subgroup of the total but only a little bit for the total. This numerical quirk is known as Simpson’s Paradox. If the proportions of the subgroups (Asians, American Indians…) change, then the overall average is affected not just by the changes in each subgroup's average SAT score but also by the changes in the weighting of each subgroup. So if the overall gain is less than even the lowest gain for a subgroup, you know that the weighting must have increased for one or more of the subgroups with lower SAT scores.
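
Here is a small sketch of that weighting effect. The two groups, their test-taker counts, and their average scores are made-up numbers chosen only for illustration. They are not the actual SAT data. Group B has the lower average and grows as a share of all test takers.

```python
# Hypothetical illustration of the weighting effect described above.
# Group names, counts, and average scores are invented, not real SAT data.

def overall_average(groups):
    """Weighted average across subgroups given as (count, average score)."""
    total_takers = sum(count for count, _ in groups.values())
    total_points = sum(count * mean for count, mean in groups.values())
    return total_points / total_takers

# name: (number of test takers, average score)
before = {"A": (800, 520), "B": (200, 420)}
after = {"A": (700, 530), "B": (300, 440)}  # B's share rises from 20% to 30%

# Gain in each subgroup's own average
subgroup_gains = {name: after[name][1] - before[name][1] for name in before}

# Gain in the overall (weighted) average
overall_gain = overall_average(after) - overall_average(before)

print(subgroup_gains)  # {'A': 10, 'B': 20}
print(overall_gain)    # 3.0
```

Both groups improve by at least 10 points, yet the overall average moves only from 500 to 503 because the lower-scoring group now carries more weight.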

Take care when you make decisions based on your understanding of data to avoid assumptions that may not be correct.

9 Responses to “Understanding Data”

  1. Curious Cat Science and Engineering Blog » Report on K-12 Science Education in USA
    July 4th, 2006 @ 6:57 pm

    […] We commented on one example of why it is important to be careful in making conclusions based on data recently (in our management improvement blog). Most often people look for the differences to highlight the differences. That creates a bias to find such differences, which leads me to be a bit skeptical of such claims without an explanation of why the data is convincing that such a difference is significant and not just variation in the data. […]

  2. CuriousCat: Search Share Data - Checking the ACSI
    September 20th, 2007 @ 7:35 pm

    Google grew 39.8% year over year and Yahoo grew 8.9% year over year. Google now has 53.6% of the total searches…

  3. CuriousCat: Fooled by Randomness
    October 30th, 2007 @ 7:44 pm

    When people are asked to explain random variations in data they will make up special causes (that they often even believe are special causes even when they are not)…

  4. Curious Cat Science and Engineering Blog » 500 Year Floods
    July 13th, 2008 @ 6:50 pm

    It would seem to me, in fact, actually having a 500 year flood actually increases the odds for it happening again (because the data now includes that case which had not been included before)…

  5. Curious Cat Science and Engineering Blog » Mistakes in Experimental Design and Interpretation
    August 29th, 2008 @ 4:13 pm

    We have tendencies that lead us to draw faulty conclusions from data. Given that it is important to understand what common mistakes are made to help us counter the natural tendencies…

  6. Curious Cat Management Improvement Blog » Friday Fun: Correlation
    March 6th, 2009 @ 10:00 am

    “I used to think correlation implied causation.” Now…

  7. Curious Cat Science and Engineering Blog » Bigger Impact: 15 to 18 mpg or 50 to 100 mpg?
    March 14th, 2010 @ 10:56 am

    You can also view 100 mpg as 1/100 gallon per mile, 2/100 gallons per mile, 5.6/100 gpm and 6.7 gpm. That way most everyone sees that the 6.7 to 5.6 gpm saves more fuel than 2 to 1 gpm does…

  8. Curious Cat Management Improvement Blog » Management Blog Posts From May 2006
    May 14th, 2010 @ 10:15 am

    […] Using Data Effectively Requires Thought – The numbers are just proxies for the real situation. Look beyond the numbers you see to what they mean and understand how the numbers presented may not fully capture the important details you need to consider. […]

  9. Is the Results Due to Mathematical Probability or Individual Merit? « The W. Edwards Deming Institute Blog
    August 1st, 2013 @ 9:37 am

    […] Understanding data is important in order to practice evidence based management. Every system has variation when you reward, or blame, people based on how the variation falls when they are around that is likely not a particularly helpful practice. […]
