Reject Summary Statistics
A summary statistic is what you get when you reduce a bunch of numbers down to a single number:
But it doesn’t have to be the mean, it could be:
- median / percentile / IQR
- mode
- min / max / range
- variance / standard deviation
- etc…
Wikipedia says:
[…] summary statistics are used to summarize a set of observations, in order to communicate the largest amount of information as simply as possible.
Why are summary statistics bad?
Summary statistics are extremely lossy.
They take all the data – full of nuances, patterns, outliers, and special cases – and collapse it down to a single point.
What if the people you know included a basketball player?
Yes: 178.6
is bigger than 177.4
… but it doesn’t begin to tell the real story:
If you don’t look at the data, you don’t understand the data…
It’s almost a cliché, but Anscombe’s quartet clearly shows the limits of summary statistics:
Those four data sets have the same:
- mean (both x and y)
- variance (both x and y)
- correlation
- linear regression
A more recent and striking example is the datasaurus dozen:
A terrible example: “average” request duration
If you’ve ever been asked:
What’s the average request duration?
for an HTTP endpoint, you know there’s only one good answer:
urgh…
The truth is that “it’s complicated” and the question itself is based on many wrong assumptions:
- the distribution is normal … it’s not
- the distribution has one mode … probably not, depending on caching, or the specific if-then-else handling, etc…
- the mean is “exact” … rather than using a confidence interval
Here is a real duration graph distribution:
Details:
- this is a density plot, a “smoothed” histogram
- yes, this is from a real production system
- it spans thousands of requests over a 1-hour window
- durations were cropped <= 250ms
- the mean duration is 62.4ms (56ms for the cropped data) – raw data
The mean duration sits exactly nowhere interesting or representative.
A failure of communication
Every time I hear/read the word “average”, I assume the worst. The average, or any summary statistics, obscures the real data, incompetently at best or maliciously at worst.
Summary statistics feel like information, but they are usually sound bites.
As a society, we need to strive for facts, for understanding, and for objective data:
- REJECT summary statistics: ask for the data if it’s missing
- refuse conclusions until the data and methodology are produced and reviewed
- raise the bar: ask for visualizations – but visualizations do NOT replace the data, they complement the data