Casper Albers explains the numbers
A statistician on a mission
All you need to understand statistics, says adjunct professor of applied statistics Casper Albers, is some common sense. It’s really not that difficult to grasp. So why do so many people struggle with it? ‘Everyone keeps saying how difficult it is’, says Albers. ‘Take the flyer for psychology, for example. It states no fewer than three times how difficult statistics is and that students need to realise that before starting the programme.’
Students who failed their first statistics class or researchers whose papers were denied due to a lack of a proper statistical foundation might disagree with him.
A statistician saying his job isn’t that difficult is like a maths teacher who fails to understand why every child doesn’t have a natural ability with square roots and integrals. But Albers maintains his stance. ‘Look at it like it’s a car: to get around, you don’t need to know how to fix it when it breaks. You just need to know the traffic rules.’
Damned lies
The ‘fix it when it breaks’ work, such as developing calculation software, figuring out the maximum deviation allowed, p values, and statistical significances, is better left to the professionals. But the rest is easily mastered.
We don’t know why people struggle with statistics so much
But what’s the best way to teach statistics? Not many people are just dying for a lesson on the theory of probability. And while Albers writes columns for de Volkskrant, the Nieuw Archief voor Wiskunde, and UKrant, there are only so many people he can reach that way. And audience aside, it’s a mystery what makes it so universally hard to get into: ‘We don’t know why people struggle with statistics so much.’
In his inaugural lecture at the Academy building on Tuesday, Albers committed to trying to figure that out over the next few years, and how best to communicate statistical information.
After all, statistics are important. Computers are constantly processing enormous amounts of information, creating election polls, calculating the risk of getting cancer by eating broccoli, or providing information on criminality among young foreigners in the Netherlands. It’s a wealth of information, but this information runs the risk of being misinterpreted or misused. ‘There’s a reason the quote goes “There’s lies, damned lies, and statistics”’, says Albers. ‘You can use statistical information to mislead people. Either deliberately, for your own political agenda, or accidentally.’
Selective shopping
Even researchers who mean well can make mistakes, Albers says. ‘They do their studies, come up with an intervention, and then expect a certain result. And with good reason. But sometimes the data doesn’t show that result. What are they to do?’
People tend to see the results they were expecting
All too often, researchers take another really good look at their research and end up deciding to retroactively leave certain factors out of it, which get them the results they expected. ‘It’s called selective shopping and it’s a big no-no in statistics’, says Albers. ‘But people tend to do it anyway. They have a paper; a looming deadline. On top of that, people tend to see the results they were expecting to see.’
Another thing that happens all too often is researchers adding test subjects to their group because the smaller group didn’t lead to the expected results.
Cheating
This, too, constitutes statistical cheating. ‘Say you want to get two sixes on two dice throws. The chances of success are 1/6 times 1/6. But what if you throw a three the first time and decide to change your goal to two threes? That changes the odds from 1/36 to 1/6. You don’t need to have taken advanced maths to understand that.’
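The arithmetic behind the dice example can be sketched in a few lines of Python (the variable names are ours, chosen for illustration):

```python
# Goal fixed in advance: two sixes on two throws.
# Both throws must match, so the probabilities multiply.
p_fixed_goal = (1 / 6) * (1 / 6)   # 1/36

# Goal changed after seeing the first throw: it came up three,
# so the "goal" quietly becomes two threes. The first throw
# already matches, and only the second throw is left to chance.
p_moved_goal = 1 / 6               # 1/6

print(p_fixed_goal)   # about 0.028
print(p_moved_goal)   # about 0.167
```

Moving the goalposts after seeing part of the data makes the result six times as likely, which is exactly why adding test subjects until the expected result appears counts as cheating.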
Sometimes researchers have done all their calculations correctly but falter in their communication with the outside world. This was the case in a study that showed that bacon is carcinogenic. It showed that when men eat one hundred grams of bacon every day for ten years, their chance of developing colon cancer increases from 5 percent to 6 percent.
‘But the press wrote that it resulted in a “twenty percent increase in the chance to get cancer”’, Albers recalls. ‘In percentage points, however, the increase was minimal. Also, no one in their right mind would eat a hundred grams of bacon a day for ten years. Anyone who eats that much fatty meat would die of a heart attack long before the cancer got them.’
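The gap between the headline and the study comes down to relative versus absolute risk, which can be checked with the figures quoted above (a back-of-the-envelope sketch, not part of the study itself):

```python
baseline = 0.05     # 5 percent chance of colon cancer, per the study cited
with_bacon = 0.06   # 6 percent after a decade of daily bacon

# Absolute increase: measured in percentage points.
absolute_increase = with_bacon - baseline            # 0.01 -> 1 percentage point

# Relative increase: what the press reported.
relative_increase = (with_bacon - baseline) / baseline  # 0.20 -> "20 percent"

print(round(absolute_increase, 2), round(relative_increase, 2))
```

Both numbers are correct; they just answer different questions. The headline picked the larger-sounding one.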
Ivory tower
Albers believes it’s possible to change the lackadaisical attitude people have towards statistics. He wants things to be different from his experience as a fledgling researcher, when all he did was write complicated psychometric models for publications no one reads. ‘After five years I’d only been cited a handful of times.’ He says he was shut up in the ivory tower, building models that no one was using.
Statistics is just a matter of logic
But when he started a Twitter account to tweet about his work five years ago, he realised that there was an entire world of people out there who don’t know a single thing about statistics. He wanted to reach them – but how? He committed to finding ways to teach a wider audience about statistics.
‘The current industry guidelines for writing about statistics are purely aesthetic: take into account that some of your readers will be over fifty, so use a large font; don’t use green and blue in the same chart, because that will confuse people who are colour-blind, etc.’
But no one has ever taken into account the psychological factors at play when people look at statistical visualisations. ‘Take a bar graph, for example’, says Albers. ‘You can display that vertically, but you can also rotate it 45 degrees.’
Message
When a graph is displayed vertically, people will immediately focus on the highest bar. But when it’s rotated, people will have a harder time reading it correctly. ‘They won’t see the differences between the various bars as distinctly as they would in a vertical visualisation’, says Albers.
The information in the graphs is the same. Researchers should ask themselves what message they want to convey to their audience, and how to do that.
Another example is the network models used in some publications. In these ‘ball graphs’, the important factors are displayed in little balls connected by lines of varying thickness. It’s a nifty way of putting a lot of information in the visualisation. ‘Some graphs will have hundreds of lines’, says Albers.
Unexpected
Albers found, however, that most people presented with these models only see whatever they expected to see in the first place. ‘It draws attention away from any new or unexpected information. It renders the models useless.’
Albers’ mission is twofold: on the one hand, he wants to try to explain to researchers how statistics works and how to handle their own results. On the other, he wants to explain to a wider audience how to interpret newspaper headlines. ‘I want to give people the tools to understand’, he says. ‘Statistics is just a matter of applying logic to numbers. Most people can do that just fine.’