Saturday, January 5, 2019

Lying with Statistics


In my last post I mentioned How to Lie with Statistics by Darrell Huff, written way back in 1954. The title is catchy. The lessons are important. I think all middle or high school students should read something like this. It’s short and easy to read. There is no complicated math, although the book illustrates the importance of quantitative reasoning. The only drawback to the book is that the numbers used to illustrate all manner of practical everyday situations are from 65 years ago and older. Personally, I found it interesting to know what average incomes were back in the day and how much household items cost.


(The cover above shows the 1993 Norton paperback reissue that I borrowed from the local library.) 

Reading the book’s introduction, I’m struck by how much more relevant the lessons are in today’s confusion of data, facts and truths. Here’s what Huff has to say. “The secret language of statistics, so appealing in a fact-minded culture, is employed to sensationalize, inflate, confuse, and oversimplify. Statistical methods and statistical terms are necessary in reporting the mass data of social and economic trends, business conditions, “opinion” polls, the census. But without writers who use the words with honesty and understanding and readers who know what they mean, the result can only be semantic nonsense.”

Will Huff teach you how to twist those facts so you can use them to your advantage? I suppose so, but his intentions seem more noble, as illustrated below. He also introduces an appropriate new word for all this activity. “Misinforming people by the use of statistical material might be called statistical manipulation; in a word (though not a very good one) statisticulation.”


(The book also has amusing and clever illustrations by Irving Geis. I like how chemistry and biology are brought into the picture.)

What are some lessons from his book?
·      How to draw a biased sample, intentional or not.
·      Adding more numbers after the decimal can be used for good and ill.
·      An “average” can mean many things. Choose appropriately.
·      Normal and desirable can be made to sound the same even though they mean different things.
·      One can make a big deal out of thin air using small numbers with no error bar.
·      Correlation is easily confused with causation.
·      Changing graph axes, or using pictures of three-dimensional objects to represent one-dimensional quantities, gives different impressions (or conclusions).
·      Use different baselines for percentage increases and decreases.
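Two of these lessons are easy to check with a few lines of Python. What follows is only a rough sketch with numbers I made up (they are not from the book), and the names `summarize` and `percent_change` are my own. It shows how three different "averages" of the same salaries can disagree, and how a percentage cut followed by the same percentage raise does not get you back to where you started, because the baseline has changed.

```python
from statistics import mean, median, mode

# Hypothetical annual salaries at a small firm (invented for illustration):
# six people earn 20,000, three earn 30,000, and the owner earns 250,000.
salaries = [20_000] * 6 + [30_000] * 3 + [250_000]


def summarize(values):
    """Return three common meanings of 'average' for the same data."""
    return {
        "mean": mean(values),      # arithmetic mean, pulled up by the owner
        "median": median(values),  # middle value of the sorted list
        "mode": mode(values),      # most frequent value
    }


def percent_change(old, new):
    """Percentage change from old to new, measured against old as baseline."""
    return (new - old) * 100 / old


averages = summarize(salaries)
print(averages)

# A cut from 100 to 80, then a raise back to 100: same amount, different
# percentages, because each is computed against a different baseline.
cut = percent_change(100, 80)
restore = percent_change(80, 100)
print(f"cut: {cut:.1f}%, restore: {restore:.1f}%")

# Applying "the same" 20% raise after a 20% cut does not restore the original.
after_cut_and_raise = 100 * 0.80 * 1.20
print(f"after a 20% cut and a 20% raise: {after_cut_and_raise:.1f}")
```

With these made-up salaries the mean is more than double the median and mode, so which "average" gets quoted depends a lot on what the person quoting it wants you to believe.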

Here are some figures from the book illustrating the latter points.




As much as I enjoyed my quick read through Huff’s book, the one I would recommend to students who actually want to learn some statistics is Naked Statistics. It’s very well written and covers the key points clearly with good examples. But for a quick, fun read, How to Lie with Statistics does the job.
