Confounding Variables

by Craig Shrives

What Are Confounding Variables?

In statistics, a confounding variable is one that distorts the relationship between two other variables. Put simply, when people present the findings of their data analysis, they don't always consider everything they should.

Here is another way to think of confounding variables. In critical thinking, an argument is usually constructed as shown in the diagram.

In the analysis of statistics, the premises are the results of analysing datasets and the environmental factors. The inference describes the relationship between the premises. In this equation, a confounding variable often takes the form of an absent premise.

All too often, a company presenting a conclusion will omit (either deliberately or unknowingly) at least one premise that ought to be included in its argument. As a result of leaving out this premise, the inference (i.e., how the company got to its conclusion) is likely to be meaningless, and its conclusion is highly likely to be false.

Tomatoes Used to Be Poisonous (Confounding Variable Example)

Let's look at this example about tomatoes. Clearly, this is nonsense. Common sense (and a basic grasp of maths and human mortality) tells you that old age killed off the pre-1910 tomato eaters, not the tomatoes. The premise "Most humans don't live beyond 100" has been omitted from this argument. That missing premise is known as a "confounding variable".

Marijuana Is the Leading Cause of Traffic Accidents (Confounding Variable Example)

Usually, the confounding variables are far harder to spot:

"During the summer holidays, one in three drivers involved in traffic accidents tested positive for marijuana. Therefore, marijuana is causing people to drive recklessly."

There are several significant confounding variables in this example, but here is a key one:

People can test positive for marijuana for up to 30 days after taking it. So, if I took marijuana three weeks ago and had an accident, I would be one of those drivers who tested positive, but the marijuana would almost certainly have had nothing to do with the accident. However, if an upstanding member of society believes marijuana is a pernicious influence on today's youth, he can use this statistic to blame marijuana for a high percentage of car crashes and to support his argument for stiffer sentencing. If this were a real example, we would have cause and effect linked not by scientifically supported evidence but by an illusion of evidence created with statistics.

The point here is that simple statistics in support of a bold conclusion are easily challenged. You can go on almost forever challenging statistics like these. I bet almost 90% of those involved in the traffic accidents would have tested positive for coffee. What are we supposed to conclude from that? Coffee causes more accidents than marijuana? What about the reverse logic? Two thirds of those involved in the accidents did not test positive for marijuana. Therefore, logically, these statistics could show it is twice as safe to drive under the influence of marijuana as not. That's nonsense, of course. To understand the significance of the statistics presented, we would need to know a few baseline figures, like the percentage of the driving population that would test positive for marijuana routinely (i.e., before having a crash).
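To see why the baseline figure matters, here is a minimal Python sketch. It assumes the "one in three" figure from the quoted claim and tries three made-up baseline rates; the function name relative_risk and the baseline values are illustrative assumptions, not real data. It uses Bayes' rule to compare the crash risk of drivers who test positive with that of drivers who test negative.

```python
def relative_risk(share_of_crash_drivers_positive, share_of_all_drivers_positive):
    """Crash risk of drivers who test positive relative to those who test negative.

    By Bayes' rule, P(crash | positive) is proportional to
    P(positive | crash) / P(positive), and likewise for negative drivers,
    so the unknown overall crash rate cancels out of the ratio.
    """
    p = share_of_crash_drivers_positive
    b = share_of_all_drivers_positive
    if not (0 < p < 1 and 0 < b < 1):
        raise ValueError("both shares must be strictly between 0 and 1")
    return (p / b) / ((1 - p) / (1 - b))


crash_share = 1 / 3  # "one in three" from the quoted claim

# Hypothetical baselines: the share of ALL drivers who would test positive.
for baseline in (0.10, 1 / 3, 0.50):
    rr = relative_risk(crash_share, baseline)
    print(f"baseline {baseline:.0%}: relative risk {rr:.2f}")
```

Under these assumptions, the same "one in three" headline is consistent with marijuana looking risky (a low baseline), irrelevant (a baseline of one in three), or even protective (a high baseline). Even then, the ratio says nothing about the 30-day detection window, which is a separate confounding variable.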

This is a simple example, and, already, we've found fault with the inference and the conclusion by identifying other confounding variables or by highlighting why the inference is biased (e.g., marijuana is a cause when 33% tested positive, but coffee isn't a cause when 90% tested positive).

Statistics can be attacked easily, and one of the best ways to do it is to identify the confounding variables that the originator left out.

"More Crimes Are Committed During a Full Moon" (Confounding Variable Example)

There's a theory out there that more crimes are committed during a full moon than during other phases of the moon. It's certainly a statistic that is believed by lots of policemen on the beat, and it's been backed up on more than one occasion by crime-database analysis. Surely, there can only be one explanation: it's the inner werewolf in us all. A full moon obviously makes us all go a little bit crazy. It is, after all, where the word "lunatic" comes from. Well, can a full moon make us all go a bit mad?

Soldiers will tell you that going on night-time patrol during a full moon is a bad thing. Well, it's a good thing for seeing where you are going, but it's a bad thing for remaining undetected. Statisticians who have studied this lunar effect (or the Transylvanian Effect, as it's also known) are divided as to whether the rise in crime rate during a full moon is a statistical anomaly (probably caused by a small dataset) or a result of more criminals being seen plying their trade in the increased moonlight.
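The better-light explanation can be sketched as a small Python simulation. Everything in it is an assumption made for illustration: the function name average_recorded_crimes, the detection probabilities, the number of crimes per night, and the crude 29-night cycle are all invented, not taken from any crime database. The number of crimes committed is held the same every night, and only the chance of a crime being seen changes.

```python
import random


def average_recorded_crimes(nights=2900, crimes_per_night=100,
                            detect_dark=0.3, detect_full_moon=0.5, seed=0):
    """Return (full-moon average, other-night average) of recorded crimes.

    The number of crimes committed is identical every night; only the
    chance that each crime is seen and recorded changes with the light.
    """
    rng = random.Random(seed)
    full_moon_counts, other_counts = [], []
    for night in range(nights):
        is_full_moon = night % 29 == 0  # crude stand-in for the lunar cycle
        p_detect = detect_full_moon if is_full_moon else detect_dark
        # Each committed crime is recorded with probability p_detect.
        recorded = sum(rng.random() < p_detect for _ in range(crimes_per_night))
        (full_moon_counts if is_full_moon else other_counts).append(recorded)
    return (sum(full_moon_counts) / len(full_moon_counts),
            sum(other_counts) / len(other_counts))


full_avg, other_avg = average_recorded_crimes()
print(f"recorded per full-moon night: {full_avg:.1f}")
print(f"recorded per other night:     {other_avg:.1f}")
```

In this sketch, the crime database shows more crime on full-moon nights even though nobody's behaviour has changed. Visibility is the confounding variable that the recorded figures alone cannot reveal.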

Now, if someone stood up publicly and presented the "inner werewolf" idea and backed it up with some very comprehensive statistics collected across every police force on the planet for the last hundred years, his presentation would be debunked as soon as you raised the better-light-conditions idea. That's the power of finding confounding variables in others' statistics. It's also one of the dangers of spinning statistics to support your arguments.
