I Don't Think That Number Means What You Think It Means (No Pun Intended)
Statistics don't lie, but they can mislead
Figures don't lie, but sometimes people are confused by what certain numbers denote (I’m not going to write “mean” again — yet).
You have probably heard people say something like, “In the olden days, people only lived to be 45 years old.”
But was that really the case, that people died of old age at 45, in a sort of Benjamin Button (progeria) fashion? No. Those who think that are confused by averages. Many people in 19th Century America lived to be eighty or ninety years old, the same way many of us today still do. But there were more who died young, skewing the average (or the “mean,” to be more precise).
Consider this: Many women gave birth to lots of children in those times, sometimes as much as a dozen or more. However, a fairly large percentage of their babies died at birth or shortly thereafter, or from childhood diseases when they were still quite young. Actually, if you lived through those dangerous events (birth) and years (when susceptible to childhood diseases), your chances were pretty good of living to a “ripe old age.”
Take a family with ten children, who lived to these ages (where “0” means they died at birth):
Father: 76
Mother: 80
Child 1: 82
Child 2: 0
Child 3: 2
Child 4: 85
Child 5: 76
Child 6: 5
Child 7: 68
Child 8: 3
Child 9: 91
Child 10: 8
The average lifespan of the family was 48 (576 / 12), but there were three octogenarians and one nonagenarian in the group. So you see, average (the mean) can be misleading.
It is sometimes more telling to use the median rather than the mean (or “average”). Let’s list the ages from youngest to oldest:
0, 2, 3, 5, 8, 68, 76, 76, 80, 82, 85, 91
In this case, the median (the number in the middle) is 72 (68 and 76 are in the middle, and the number between them is 72). This is a pretty good estimate of how long you could have expected to live if you died “of old age” rather than the aforementioned causes of an early demise.
Another method of arriving at a representative figure is the mode, which is the number in the set that occurs most frequently. In our small sample, it is 76, as that is the only lifespan that appears more than once (appearing twice). Again, this (the mode) gives a clearer picture of “how long people lived” back then than the mean, or average, does.
Figures don’t lie, but they can mislead the unwary or inattentive. As Mark Twain put it, “There are lies, damned lies, and statistics.”