Wheelan opens Naked Statistics with the admission that he sometimes struggled to see the relevance of what he was learning as a math student. Therefore, he puts the relevance of statistics front and center in the book, building his discussion of each statistics concept around why we should know about it. Better yet, Wheelan proves that statistics don’t need to be intimidating by putting the math behind statistics into digestible terms and explaining concepts with relatable, relevant, and even humorous examples.
This guide largely focuses on two main themes in Naked Statistics. First, we cover what many common statistics mean, how to interpret them, and why they matter. Like Wheelan, we use real and fictional examples to add context to each statistic covered. Second, we examine Wheelan’s discussion of the consequences of bias and the misapplication and misinterpretation of statistics to make the case that everyone should develop basic statistical literacy.
We rely on data to make sense of the world, but without statistics, datasets would be largely useless. Imagine asking a car salesperson what kind of mileage a car gets, only to receive a 100-page spreadsheet listing every mile the car has driven and how much gas it used on each one! The spreadsheet may be comprehensive, but it doesn't help if you were hoping for a quick answer. With statistics, we can transform unwieldy datasets into meaningful and actionable values, like average miles per gallon.
Statistics that summarize datasets are called descriptive statistics. Two of the most familiar and commonly used descriptive statistics are the mean (the average) and the median (the middle number when you put all of your data in numerical order, or the average of the two middle numbers when the dataset has an even number of values). The mean and median are called measures of central tendency, and while they both tell us about the “middle” of a dataset, Wheelan explains that they can convey very different messages. With a basic understanding of statistics, we can learn when to use one over the other and spot when someone might be reporting the mean instead of the median (or vice versa) to further an agenda.
Say the authorities at a fictional beach collected data on the number of jellyfish stings swimmers suffered each week throughout the summer. The data might look something like this:
Jellyfish stings per week per 500 swimmers

| Month | Week 1 | Week 2 | Week 3 |
|---|---|---|---|
| June | 0 | 0 | 0 |
| July | 0 | 0 | 0 |
| August | 0 | 1 | 3 |
| Sept | 50 | 150 | 300 |
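To see how differently the two measures of central tendency can describe the same data, here is a minimal sketch that computes both for the jellyfish figures above. It uses Python's standard `statistics` module; the variable names (`stings`, `avg`, `mid`) are illustrative choices of ours and don't come from Wheelan's book.

```python
from statistics import mean, median

# Weekly jellyfish stings per 500 swimmers, in the order listed in the table.
stings = [0, 0, 0, 0, 0, 0, 0, 1, 3, 50, 150, 300]

# Mean: the sum of all values divided by how many values there are.
avg = mean(stings)

# Median: the middle of the sorted data. With 12 values (an even count),
# it is the average of the 6th and 7th values.
mid = median(stings)

print(f"mean stings per week:   {avg}")
print(f"median stings per week: {mid}")
```

For these twelve weeks, the mean works out to 42 stings per week while the median is 0. A handful of bad weeks at the end of the season pulls the mean far above what a typical week looked like, which is why the choice between the two can change the story the data seems to tell.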