Posts about Data

Google’s Innovative Use of Economics

Secret of Googlenomics: Data-Fueled Recipe Brews Profitability

Google depends on economic principles to hone what has become the search engine of choice for more than 60 percent of all Internet surfers, and the company uses auction theory to grease the skids of its own operations. All these calculations require an army of math geeks, algorithms of Ramanujanian complexity, and a sales force more comfortable with whiteboard markers than fairway irons.

Varian tried to understand the process better by applying game theory. “I think I was the first person to do that,” he says. After just a few weeks at Google, he went back to Schmidt. “It’s amazing!” Varian said. “You’ve managed to design an auction perfectly.” To Schmidt, who had been at Google barely a year, this was an incredible relief. “Remember, this was when the company had 200 employees and no cash,” he says. “All of a sudden we realized we were in the auction business.”

Google even uses auctions for internal operations, like allocating servers among its various business units. Since moving a product’s storage and computation to a new data center is disruptive, engineers often put it off. “I suggested we run an auction similar to what the airlines do when they oversell a flight. They keep offering bigger vouchers until enough customers give up their seats,” Varian says. “In our case, we offer more machines in exchange for moving to new servers. One group might do it for 50 new ones, another for 100, and another won’t move unless we give them 300. So we give them to the lowest bidder—they get their extra capacity, and we get computation shifted to the new data center.”
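The lowest-bidder rule Varian describes can be sketched in a few lines of Python. This is only an illustration of the idea in the quote, not Google's actual mechanism: the group names, the function name and the `moves_needed` parameter are hypothetical, and the bids are the 50, 100 and 300 machines from the quote.

```python
from typing import Dict, List, Tuple


def run_reverse_auction(bids: Dict[str, int], moves_needed: int) -> List[Tuple[str, int]]:
    """Pick the groups asking for the fewest extra machines in exchange for moving.

    bids maps a (hypothetical) group name to the number of new machines
    that group wants before it agrees to move to the new data center.
    """
    if moves_needed < 0:
        raise ValueError("moves_needed must not be negative")
    # Lowest ask wins; ties are broken by group name so the result is repeatable.
    ranked = sorted(bids.items(), key=lambda item: (item[1], item[0]))
    return ranked[:moves_needed]


# Bids taken from the quote; the group names are made up.
bids = {"group_a": 50, "group_b": 100, "group_c": 300}
winners = run_reverse_auction(bids, moves_needed=1)
print(winners)
```

With one move needed, the group that asked for 50 machines is selected; asking for two moves would add the group that asked for 100.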

Google continues to make bold moves, putting faith in its ability to find innovative solutions that others reject as impossible. It is a challenging but interesting path to success, for them at least.

Related: Google Should Stay True to Their Management Practices; Google’s Answer to Filling Jobs Is an Algorithm; The Google Way: Give Engineers Room; Google Website Optimizer; Google: Experiment Quickly and Often; posts on innovation in management

Revealed Preference

Revealed Preference: the preference consumers display by their actions, in contrast to what they may say they prefer. While surveys may be useful, people often say they will do one thing and then, when actually given the choice, don’t do it.

Normally what matters is not what people say they want but what they actually will choose. For that reason revealed preference is a better measure than stated preference. Stated preference is often used as a proxy for actual preference (which may be fine) but it is important to understand that it is just a proxy for actual preference.

See more explanations from the Curious Cat Management Dictionary.

Related: Packaging Improvement; All Models Are Wrong But Some Are Useful; Dangers of Forgetting the Proxy Nature of Data; Confirmation Bias; Be Careful What You Measure

Red Bead Experiment Webcast

Dr. Deming used the red bead experiment to present a view into management practices and his management philosophy. The experiment provides insight into all four aspects of Dr. Deming’s management system: understanding variation, understanding psychology, systems thinking and the theory of knowledge.

Red Bead Experiment by Steve Prevette

Various techniques are used to ensure a quality (no red bead) product. There are quality control inspectors, feedback to the workers, merit pay for superior performance, performance appraisals, procedure compliance, posters and quality programs. The foreman, quality control, and the workers all put forth their best efforts to produce a quality product. The experiment allows the demonstration of the effectiveness (or ineffectiveness) of the various methods.
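A small simulation can make the point about variation concrete. The sketch below is an illustration only, under assumptions not stated in the text above: a box of 800 red and 3,200 white beads, a 50-bead paddle, six workers and four days (commonly quoted figures for the experiment). Every function name is hypothetical.

```python
import random
from statistics import mean


def draw_paddle(rng, red=800, white=3200, paddle=50):
    """One worker's daily output: count the red beads in one paddle draw."""
    beads = [1] * red + [0] * white  # 1 = red (defect), 0 = white
    return sum(rng.sample(beads, paddle))  # sampling without replacement


def simulate(workers=6, days=4, seed=0):
    """Red bead counts per worker per day; every worker uses the same system."""
    rng = random.Random(seed)
    return {
        f"worker_{w + 1}": [draw_paddle(rng) for _ in range(days)]
        for w in range(workers)
    }


def np_chart_limits(counts, paddle=50):
    """Three-sigma limits for counts of defects in constant-size samples."""
    avg = mean(counts)
    p = avg / paddle
    spread = 3 * (avg * (1 - p)) ** 0.5
    return max(0.0, avg - spread), avg, min(float(paddle), avg + spread)


results = simulate()
all_counts = [c for days in results.values() for c in days]
lower, centre, upper = np_chart_limits(all_counts)
for worker, counts in results.items():
    print(worker, counts)
print(f"control limits: {lower:.1f} to {upper:.1f} (centre {centre:.1f})")
```

Because every simulated worker draws from the same box, differences between their counts come from the system alone, which is the situation the merit pay and appraisal techniques above fail to recognize.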

Related: Fooled by Randomness; Performance Measures and Statistics Course; Performance without Appraisal; Exploring Deming’s Management Ideas; Eliminate Slogans

How to Create a Control Chart for Seasonal or Trending Data

Lynda Finn, President of Statistical Insight, has written an article on how to create a control chart for seasonal or trending data (where there is an underlying structural variation in the data). Essentially you need to account for the structural variation to create the control limits for the control chart. She also provides a Minitab project file. Both are available for download from the Curious Cat Management Improvement Library.
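One way to account for a structural trend, sketched here as an assumption rather than as the method in Lynda Finn's article, is to fit a straight line, build individuals-chart limits from the moving ranges of the residuals, and then centre those limits on the fitted line. The data series and function names below are made up for illustration; seasonal data would need a seasonal term in place of the straight line.

```python
from statistics import mean


def fit_line(ys):
    """Least-squares slope and intercept for ys observed at x = 0, 1, 2, ..."""
    xs = range(len(ys))
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return slope, y_bar - slope * x_bar


def trended_limits(ys):
    """(lower, centre, upper) for each point, with limits that follow the trend."""
    slope, intercept = fit_line(ys)
    fitted = [intercept + slope * x for x in range(len(ys))]
    residuals = [y - f for y, f in zip(ys, fitted)]
    moving_ranges = [abs(b - a) for a, b in zip(residuals, residuals[1:])]
    # 2.66 is the usual individuals-chart constant (3 / 1.128).
    half_width = 2.66 * mean(moving_ranges)
    return [(f - half_width, f, f + half_width) for f in fitted]


def out_of_control(ys):
    """Indexes of points outside the trended control limits."""
    limits = trended_limits(ys)
    return [i for i, (y, (lo, _, hi)) in enumerate(zip(ys, limits)) if y < lo or y > hi]


# Made-up series: a steady upward trend with one unusual point at index 7.
series = [10, 12, 11, 13, 14, 13, 15, 25, 16, 17, 18, 17]
print(out_of_control(series))
```

Limits computed from the raw data without removing the trend would be widened by the trend itself, so the structural variation is taken out first and the limits are drawn around the fitted line.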

Related: Control Charts in Health Care; Common Cause Variation; Managing with Control Charts; Measurement and Data Collection; Fourth Generation Management

Harvard’s Masters of the Apocalypse

This article makes some good points, even if it is a bit sensationalist, and intentionally so: Harvard’s masters of the apocalypse by Philip Delves Broughton

Business schools have shown a remarkable ability to miss the economic catastrophes unfolding before their eyes.

In the late 1990s, their faculties rushed to write paeans to Enron, the firm of the future, the new economic paradigm. The admiration was mutual: Enron was stuffed with Harvard Business School alumni, from Jeff Skilling, the chief executive, down. When Enron, rotten to the core, collapsed, the old case studies were thrust in a closet and removed from the syllabus, and new ones were promptly written about the ethical and accounting issues posed by Enron’s misadventures.

Is there a pattern here? Go back to the 1980s, and you find that Harvard MBAs played a big enough role in the insider trading scandals that washed through Wall Street for a former chairman of the SEC to consider it a good move to donate millions of dollars for the teaching of ethics at the school.

Time after time, and scandal after scandal, it seems that a school that graduates just 900 students a year finds itself in the thick of it. Yet there is remarkably little contrition.

Last October, Harvard Business School celebrated its 100th birthday with a global summit in Boston. While Wall Street and Washington descended into an economic inferno, Jay Light, the dean of the school and a board member at the Blackstone private equity group, opened the festivities by shrugging off any responsibility.

“We all failed to understand how much [the financial system] had changed in the past 15 years or so, and how fragile it might be because of increased leverage, decreased transparency and decreased liquidity: three of the crucial things in the world of financial markets,” he said.

You can draw up a list of the greatest entrepreneurs of recent history, from Larry Page and Sergey Brin of Google and Bill Gates of Microsoft, to Michael Dell, Richard Branson, Lakshmi Mittal – and there’s not an MBA between them.

Yet the MBA industry continues to grow, and business schools provide vital income to academic institutions: 500,000 people around the world now graduate each year with an MBA, 150,000 of those in the United States, creating their own management class within global business.

Given the present chaos, shouldn’t we be asking if business education is not just a waste of time, but actually damaging to our economic health?

Business schools unfortunately continue to take a heavily simplistic, numbers-driven (without an understanding of variation) and fad-driven approach to management. W. Edwards Deming spoke out against the damage they were causing decades ago, and I see little evidence they have learned from their failures.

Schools are good for making connections and getting a piece of paper. Some companies won’t consider you for some jobs unless you have a document saying you have an MBA. I strongly question the wisdom of only hiring an MBA to do some job. But many companies like to use simple criteria, such as: without a piece of paper saying you have an MBA, we won’t consider you for this job. So if you want a job from them, getting that piece of paper is important.

Related: What is Wrong with MBA’s; Management Training Program; Management Advice Failures; The Lean MBA

Friday Fun: Correlation

Correlation doesn't imply causation

From the excellent xkcd comic.

Related: Correlation is Not Causation; Does the Data Deluge Make the Scientific Method Obsolete?; Understanding Data; Theory of Knowledge; What Makes Scientists Different :-); Dangers of Forgetting the Proxy Nature of Data; Seeing Patterns Where None Exists

Statistics for Experimenters in Spanish

book cover of Estadística para Investigadores

Statistics for Experimenters, second edition, by George E. P. Box, J. Stuart Hunter and William G. Hunter (my father) is now available in Spanish.

You can read a bit more on the Spanish edition, in Spanish: Estadística para Investigadores: Diseño, innovación y descubrimiento, Segunda edición.

Statistics for Experimenters – Second Edition:

Catalyzing innovation, problem solving, and discovery, the Second Edition provides experimenters with the scientific and statistical tools needed to maximize the knowledge gained from research data, illustrating how these tools may best be utilized during all stages of the investigative process. The authors’ practical approach starts with a problem that needs to be solved and then examines the appropriate statistical methods of design and analysis.

* Graphical Analysis of Variance
* Computer Analysis of Complex Designs
* Simplification by transformation
* Hands-on experimentation using Response Surface Methods
* Further development of robust product and process design using split plot arrangements and minimization of error transmission
* Introduction to Process Control, Forecasting and Time Series

Book available via Editorial Reverte

Related: Statistics for Experimenters Review; Correlation is Not Causation; Statistics for Experimenters Data; posts on design of experiments

What’s the Value of a Big Bonus?

What’s the Value of a Big Bonus? by Dan Ariely

To look at this question, three colleagues and I conducted an experiment. We presented 87 participants with an array of tasks that demanded attention, memory, concentration and creativity. We asked them, for instance, to fit pieces of metal puzzle into a plastic frame, to play a memory game that required them to reproduce a string of numbers and to throw tennis balls at a target. We promised them payment if they performed the tasks exceptionally well. About a third of the subjects were told they’d be given a small bonus, another third were promised a medium-level bonus, and the last third could earn a high bonus.

So it turns out that social pressure has the same effect that money has. It motivates people, especially when the tasks at hand require only effort and no skill. But it can provide stress, too, and at some point that stress overwhelms the motivating influence.

When I recently presented these results to a group of banking executives, they assured me that their own work and that of their employees would not follow this pattern. (I pointed out that with the right research budget, and their participation, we could examine this assertion. They weren’t that interested.)

This is an interesting look at an effect of bonuses. We all know monetary bonuses can influence behavior. The problem is the type of behaviors that result. Huge bonuses, for example, create huge incentives to risk the future of the company for the chance at a huge bonus for the executive. Extrinsic motivation leads to many problems.

Problems with bonuses: Losses Covered Up to Protect Bonuses; “Pay for Performance” is a Bad Idea; Problems with Bonuses; Book: Punished By Rewards: The Trouble With Gold Stars, Incentive Plans, A’s, Praise, and Other Bribes by Alfie Kohn; posts on executive pay

Easiest Countries for Doing Business 2008

Singapore is again ranked first for Ease of Doing Business by the World Bank. For some reason they label the report issued in any given year as the report for the next year (which makes no sense to me). The data shown below is for the year they released the report.

Country           2008  2007  2006  2005
Singapore            1     1     1     2
New Zealand          2     2     2     1
United States        3     3     3     3
Hong Kong            4     4     5     6
Denmark              5     5     7     7
United Kingdom       6     6     6     5
Other countries of interest:
Canada               8     7     4     4
Japan               12    12    11    12
Germany             25    20    21    21

The rankings include ranking of various aspects of running a business. Some rankings for 2008: Dealing with Construction Permits (Singapore and New Zealand 2nd, USA 26th, China 176th), Employing Workers (Singapore and the USA 1st, Germany 142nd), protecting investors (New Zealand 1st, Singapore 2nd, Hong Kong 3rd, Malaysia 4th, USA 5th), enforcing contracts (Singapore 1st, Hong Kong 2nd, USA 6th, China 18th), getting credit (Malaysia 1st; UK and Hong Kong 2nd; Singapore, New Zealand and USA 5th), paying taxes (Hong Kong 3rd, USA 46th, Japan 112th, China 132nd).

These rankings are not the final word on exactly where each country truly ranks, but they do provide an interesting view. With this type of data there is plenty of room for judgment and issues with the data. Several of my posts, from my other blogs, that I recommend on this topic: The Future is Engineering, Science and Engineering in Global Economics and Intellectual Property Rights and Innovation.

Related: Easiest Countries from Which to Operate Businesses 2007; Countries Which are Easiest for Doing Business 2006; New Look American Manufacturing; Top Manufacturing Countries (2007); Oil Consumption by Country; International Health Care System Performance; Economics, America and China

Does the Data Deluge Make the Scientific Method Obsolete?

The End of Theory: The Data Deluge Makes the Scientific Method Obsolete by Chris Anderson

“All models are wrong, but some are useful.”

So proclaimed statistician George Box 30 years ago, and he was right. But what choice did we have? Only models, from cosmological equations to theories of human behavior, seemed to be able to consistently, if imperfectly, explain the world around us. Until now. Today companies like Google, which have grown up in an era of massively abundant data, don’t have to settle for wrong models. Indeed, they don’t have to settle for models at all.

Speaking at the O’Reilly Emerging Technology Conference this past March, Peter Norvig, Google’s research director, offered an update to George Box’s maxim: “All models are wrong, and increasingly you can succeed without them.”

There is now a better way. Petabytes allow us to say: “Correlation is enough.” We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.

See update, below: Norvig was misquoted; he agrees with Box’s maxim.

I must say I am not at all convinced that a new method without theory is ready to supplant the existing scientific method. Now I can’t find Peter Norvig’s exact words online (come on Google, organize all the world’s information for me please). If he said that using massive stores of data to make discoveries in new ways is radically changing how we can learn and create useful systems, that I believe. I do enjoy the idea of trying radical new ways of viewing what is possible.

Practice Makes Perfect: How Billions of Examples Lead to Better Models (summary of his talk on the conference web site):

In this talk we will see that a computer might not learn in the same way that a person does, but it can use massive amounts of data to perform selected tasks very well. We will see that a computer can correct spelling mistakes, translate from Arabic to English, and recognize celebrity faces about as well as an average human—and can do it all by learning from examples rather than by relying on programming.

Related: Will the Data Deluge Makes the Scientific Method Obsolete?; Pragmatism and Management Knowledge; Data Based Decision Making at Google; Seeing Patterns Where None Exists; Manage what you can’t measure; Data Based Blathering; Understanding Data; Webcast on Google Innovation
Continue reading

Management Blog Posts From September 2005

photo of North Cascades National Park

Here are some posts from the blog 3 years ago, this month. I took the photo on my visit to North Cascades National Park.

I have added a page to my personal web site with links to my pages on social web sites (LinkedIn, Reddit, Kiva…).

Hiring the Right Person

Malcolm Gladwell presented at the New Yorker conference on the Challenge of Hiring in the Modern World. As usual, he provides some great thoughts. I wrote on Hiring the Right Workers:

The job market is an inefficient market. There are many reasons for this including relying on specification (this job requires a BS in Computer Science – no Bill Gates you don’t meet the spec) instead of understanding the system. Insisting on managing by the numbers even when the most important figures are unknown and maybe unknowable. Using HR to find the right person to work in a process they don’t understand (which reinforces the desire to focus on specifications instead of a more nuanced approach). The inflexibility of companies: so if a great person wants to work 32 hours a week – too bad we can’t hire them. And on and on.

Malcolm Gladwell doesn’t use the same language but I think he expresses many of the same ideas: “Insisting on managing by the numbers even when the most important figures are unknown and maybe unknowable.” etc. He frames this idea as a mismatch problem.

Related: Hiring: Silicon Valley Style; People are Our Most Important Asset; Malcolm Gladwell Synchronicity; Hiring, Does College Matter?; Interviewing and Hiring Programmers; Gladwell (and Drucker) on Pensions

Outcome and In-Process Measures

An outcome measure is used to measure the success of a system. For example, the outcome measure could be the percentage of people who do not get polio (the result). An output measure, for example, would be the number of people vaccinated with the polio vaccine (the output). Often we measure inputs (amount of money spent) or outputs (number of people vaccinated). They are usually easy to measure but are obviously less valuable proxies for the objective of the system (reducing the incidence of polio).

You should have all these types of measures, but outcome measures are most likely to be missing, so special care should be taken to make sure you are using them. It is important to define good outcome measures to use in determining the success of systems, and in determining whether improvement projects actually result in improved outcomes.

In-process measures can be valuable in providing actionable information sooner than the outcome measure would allow. In the polio example, an in-process measure could be the percentage of babies vaccinated by the time they are 18 months old. Looking across a country, say, it might well make sense to stratify the data to see if certain areas are doing poorly on this measure. If so, that might be where to focus improvement. You don’t need to wait until unvaccinated people start contracting polio (which will likely be delayed for years after the processes start to fail, in this example) to notice the problem and react.

Waiting for the outcome measure to point to a problem in this case (and in many cases) is far too late for process improvement. So process measures are needed to aid in managing the system and reacting to process results, before those processes create poor results (and can be seen as poor outcome measures). More on outcome measures.
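Stratifying an in-process measure by area, as described above, takes only a few lines. The sketch below is an illustration with invented records: the area names, the counts and the 90% threshold are all made up, and the function names are hypothetical.

```python
from collections import defaultdict


def vaccination_rate_by_area(records):
    """Percentage of babies vaccinated by 18 months, stratified by area.

    records is an iterable of (area, vaccinated_by_18_months) pairs.
    """
    totals = defaultdict(int)
    vaccinated = defaultdict(int)
    for area, done in records:
        totals[area] += 1
        vaccinated[area] += int(done)
    return {area: 100.0 * vaccinated[area] / totals[area] for area in totals}


def areas_needing_attention(rates, threshold=90.0):
    """Areas whose in-process measure falls below an (arbitrary) threshold."""
    return sorted(area for area, rate in rates.items() if rate < threshold)


# Invented data: 10 babies per area, purely for illustration.
records = (
    [("north", True)] * 9 + [("north", False)] * 1
    + [("south", True)] * 6 + [("south", False)] * 4
    + [("east", True)] * 10
)
rates = vaccination_rate_by_area(records)
print(rates)
print(areas_needing_attention(rates))
```

The overall rate in this made-up data is about 83%, which hides the fact that one area is far behind the others; the stratified view shows where to focus improvement long before the outcome measure would.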

Related: Operational Definition; tampering; management improvement web search; Measuring and Managing Performance in Organizations; Data is a Proxy; posts on managing using data

Post Number 1,000

This is the 1,000th post to the Curious Cat Management Improvement Blog. Here are some highlights:

Making a Difference

Kiva provides loans through partners (operating in the countries) to the entrepreneurs. Those partners do charge the entrepreneurs interest (to fund the operations of the lending partner). Kiva pays the principal back to you but does not pay interest. And if the entrepreneur defaults then you do not get your capital paid back (in other words you lose the money you loaned).

They do an excellent job of using the internet to allow people like me to feel connected to people we can help. And in so doing, they do an excellent job of implementing their strategy (providing funds for micro-loans) to achieve their goal (to alleviate poverty). “Kiva’s mission is to connect people through lending for the sake of alleviating poverty.”

Today I added $450 to my loan portfolio with Kiva and donated another $100 to Kiva. I added 5 loans in: Tanzania (2 loans), Uganda, Paraguay and Ecuador.

I am happy with the success of the Curious Cat blogs but I do have one item I wish would improve. I wish more Curious Cat readers would take advantage of Kiva. If you lend through Kiva, please add a comment with a link to your Kiva page and I will add you to our list of Curious Cat Kiva Contributors.

The Kiva web site includes all sorts of data on the partners making the loans (the capital at risk is provided by Kiva donors but a local organization services the loans…). For example, see the profile for Tujijenge Tanzania Ltd. This shows, for example, the Amount Repaid Vs Expected Rate (100% for this partner: no defaults or delinquency). The rates for all Kiva loans are 3.75% delinquent and 0.12% defaulted. They also show the Average Interest Rate Borrower Pays To Kiva Field Partner (which is 24% in this example) and the Average Local Money Lender Interest Rate (which is 60%).

One of the things I really hope to see is some research on the results Kiva is producing. What kind of changes are these loans bringing about, looking specifically at Kiva? I would also like to see analysis of various factors, such as the interest rate, and whether targeting my lending to partners with lower average rates results in greater benefit. There are a great many unknown and unknowable numbers involved, but some data would be interesting, as would analysis of results even without numbers.

Related: Using Capitalism to Make the World Better; Frontline Explores Kiva in Uganda; Providing a Helping Hand via Kiva; Expanding Credit Access: Using Randomized Supply Decisions to Estimate the Impacts; Microfinance research links

Data Visualization

Data is often displayed poorly, making it difficult to see what is important. When data is displayed well, the important facts should leap off the page and into the viewer’s mind. Edward Tufte is an expert on this topic with great books. If you have not read them, you should: Beautiful Evidence, The Visual Display of Quantitative Information, Envisioning Information and Visual Explanations.

Smashing Magazine has some nice examples of good display techniques in Data Visualization: Modern Approaches. I don’t like all the examples they show, but the article does provide some help by showing some creative ways to display data.

Related: Edward Tufte’s new book: Beautiful Evidence; Great Charts; Data Visualization Example

Data Visualization Example

In Myths About the Developing World, Hans Rosling shows some great graphics to display data on health care outcomes. This is one of the talks from the great TED conference that we have mentioned before. They really have some great webcasts available on their site.

The presentation also gives a concrete example of faulty knowledge (people thinking things which are not so – related to theory of knowledge). He also makes good points on stratifying data at the 14 minute mark. See gapminder.org for good additional material.

Related: Great Charts; Open Access Education Materials

Visible Data

Effective visual signals are important for effective management improvement: lean thinking emphasizes such ideas. Top 5 Rules of Effective Measurement Boards is an excellent post on how to make measurement effective.

Take the time to find the important measures and then don’t keep data hidden in some drawer or computer file out of people’s view and therefore out of mind. Post the important data for everyone to see. Review the data as changes are made and see that the changes had the desired result. Update the measures when appropriate (for posting visibly – you will of course be measuring more than the few measures that belong on measurement boards).
Continue reading

Targets Distorting the System

I still remember Dr. Brian Joiner speaking about process improvement and the role of data well over a decade ago. He spoke of 3 ways to improve the figures: distort the data, distort the system and improve the system. Improving the system is the most difficult.

There is an interesting article on the effects of distorting the system: Tony Blair says he will ensure NHS targets do not stop people from seeing their GPs when they want to, from BBC News.

The promise follows claims that some GPs’ surgeries are refusing to set appointments more than two days in advance because of the targets.

To make the data meet the targets, the system is distorted to hit the target rather than to serve the customer.

From Peter Scholtes’ article published in National Productivity Review in 1993, Total Quality or Performance Appraisal: Choose One:
Continue reading

Statistics for Experimenters – Second Edition

Buy Statistics for Experimenters

The classic Statistics for Experimenters has been updated by George Box and Stu Hunter, two of the three original authors. Bill Hunter, the other author and my father, died in 1986. Order online: Statistics for Experimenters: Design, Innovation, and Discovery, 2nd Edition by George E. P. Box, J. Stuart Hunter, William G. Hunter.
I happen to agree with those who call this book a classic; however, I am obviously biased.

Google Scholar citations for the first edition of Statistics for Experimenters.
Citations in CiteSeer to the first edition.

The first edition includes the text of Experiment by Cole Porter. In 1978 finding a recording of this song was next to impossible. Now Experiment can be heard on the De-Lovely soundtrack.

Text from the publisher on the 2nd Edition:
Rewritten and updated, this new edition of Statistics for Experimenters adopts the same approaches as the landmark First Edition by teaching with examples, readily understood graphics, and the appropriate use of computers. Catalyzing innovation, problem solving, and discovery, the Second Edition provides experimenters with the scientific and statistical tools needed to maximize the knowledge gained from research data, illustrating how these tools may best be utilized during all stages of the investigative process. The authors’ practical approach starts with a problem that needs to be solved and then examines the appropriate statistical methods of design and analysis.
Continue reading
