The scientific method is a toolkit that provides the only reliable way to learn how things work in the real world. But some people have come to distrust it because science keeps changing its mind. It’s supposed to tell us “The Truth” in black and white and capital letters, but instead it’s all grey and wishy-washy. One day it tells us coffee is bad for us; the next day it says coffee is good for us. What are we to believe?
Here’s how a typical progression might go (disclaimer: this example is fictional; any resemblance to real kumquats, living or dead, is purely coincidental):
- A news report says “New study shows that people who eat kumquats live 40% longer.” So you rush out and buy kumquats and start eating them regularly. You feel smart and virtuous.
- The next year, another news report says “New study shows that people who eat kumquats don’t live that much longer after all: they only live 5% longer.” So you keep eating kumquats, but with less enthusiasm. You feel disappointed.
- The next year, another news report says “New study shows no difference in longevity between those who eat kumquats and those who don’t.” Now you don’t know what to do. Perhaps you stop eating kumquats. You are confused, unhappy, and resentful.
- The year after that, another news report says, “New study shows people who eat kumquats don’t live as long and are twice as likely to develop cancer.”
Now you are really upset. Science has deceived you. It has let you down. It has tricked you into doing something that may have hurt you. How can you ever trust it again?
The irony is that this kind of thinking pushes people to embrace pseudoscience and quackery. That’s stupid. If science is flawed, it doesn’t make sense to replace it with something even more flawed. If your expert mechanic hasn’t been able to fix your car, you shouldn’t expect your barber or your 5-year-old to do any better. Science is based on evidence and is committed to re-evaluating its provisional conclusions as the evidence changes. Isn’t that better than relying on a prescientific medical system like homeopathy, where beliefs never change and disconfirming evidence is ignored? As comedian Dara Ó Briain says,
Science knows it doesn’t know everything; otherwise, it’d stop. But just because science doesn’t know everything doesn’t mean you can fill in the gaps with whatever fairy tale most appeals to you.
Changing reports don’t mean you can’t trust science. They do mean you can’t always trust the way science is reported in the media. And they mean science is complicated. The collective enterprise of science is reliable; individual studies are not. The kumquat progression is not a failure of science; it is an illustration of the success of science. It’s exactly the way science is supposed to work.
The first study of a new hypothesis is usually a small preliminary test not intended to reach a definitive answer or to guide clinical applications. We use these “pilot studies” to guide more definitive research. They are typically followed by much larger, more rigorous studies and attempts by other scientists to replicate the findings. Sometimes those studies conflict with each other. Scientists try to tease out the possible reasons for the discrepancies. All these studies are submitted to peer review and published where they can be scrutinized and critiqued by other experts in the field. Eventually a consensus is reached based on the quantity and quality of all the evidence that has been published.
We can never go by the results of any single new study. We must weigh all the evidence. And of course the evidence must be plausible in the context of other scientific knowledge. It would take an extraordinary deluge of evidence, for example, to prove the extraordinary claim that water remembers a homeopathic substance after it has been diluted away.
CORRELATION DOESN’T PROVE CAUSATION. I put that in capitals because it is critically important that we never forget it, and the vast majority of “New study shows” reports are only reporting a correlation. The increase in autism diagnoses is correlated with the increase in piracy, but that doesn’t mean pirates cause autism or autism causes pirates. Yet when a study shows a correlation between violence and the number of hours kids watch TV, we are tempted to automatically assume that means children shouldn’t watch so much TV. We don’t stop to ask if there might be a reason violence-prone kids watch more TV, or if some other unrelated factor might lead to both violence and increased TV watching (such as poor parenting in a bad socioeconomic environment where latchkey kids use TV as a babysitter and witness violent acts in their community). And even if excessive TV leads to violence, that doesn’t necessarily mean that restricting TV would effectively prevent violence.
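To make the confounder point concrete, here is a minimal, purely illustrative simulation (the variable names and numbers are invented for this sketch, not taken from any real study): a single hidden factor drives both TV watching and violence, and the two end up correlated even though neither causes the other.

```python
import random

random.seed(0)
n = 10_000
tv_hours, violence = [], []
for _ in range(n):
    adversity = random.gauss(0, 1)                 # hidden confounder (e.g., a rough environment)
    tv = 2 + 0.8 * adversity + random.gauss(0, 1)  # adversity -> more TV watching
    viol = 0.5 * adversity + random.gauss(0, 1)    # adversity -> more violence; TV plays no role
    tv_hours.append(tv)
    violence.append(viol)

def corr(x, y):
    """Pearson correlation, standard library only."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

# Prints a clearly positive correlation, even though TV hours never enter
# the formula that generates violence in this toy model.
print(f"correlation between TV hours and violence: {corr(tv_hours, violence):.2f}")
```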
You can’t assume anything in science. No matter how intuitively obvious and logically compelling something seems, you still have to test it. Homocysteine is an instructive example. High blood levels of homocysteine correlate with the risk of heart disease. B vitamins reduce the level of homocysteine. Therefore B vitamin supplements ought to prevent heart disease. But they don’t. They do effectively lower blood homocysteine levels, but they don’t reduce the risk of heart disease.
A new study shows… but that doesn’t mean we can believe it. A classic study by John Ioannidis has taught us that most published research findings are wrong.1 Here are just a few of the factors that can contribute to that unfortunate situation:
- Researcher bias. If the kumquat study was done by the Kumquat Growers’ Association, their conscious or unconscious biases may have influenced the results.
- The file drawer effect. They may have already done 9 kumquat studies and gotten insignificant or negative results that they simply filed away. Then Study #10 got positive results and they submitted only that one for publication.
- Publication bias. Scientific journals tend to reject negative studies that didn’t find anything significant and publish only the ones with positive results.
- Poor research design or execution. Inadequate controls or even no control group. Contamination in the lab. Tweaking of data by research assistants who knew what the boss wanted. Too few subjects. Too many dropouts. Study period too short. Incomplete reporting of data. An inappropriate statistical test for the kind of data collected. They may have done the math wrong. Maybe they relied on subjects to report how many kumquats they ate last year, but memory is notoriously unreliable, and people tend to exaggerate and tell researchers what they think they want to hear.
- False positives. The p = 0.05 cutoff typically used in clinical studies means that even when a treatment has no real effect, about 1 trial in 20 will still produce a false positive result purely by chance (see the sketch after this list).
- Statistical significance doesn’t mean clinical significance. In a large study, a drug that lowers blood pressure by only 1-2 mm Hg may reach statistical significance, but that’s not enough to make any real practical difference to a patient.
- The data may have been invented or falsified. Several instances of research fraud have been prominently featured in the news recently; I bet you can think of at least one case.
- Multiple endpoints. Maybe they looked at the effects of kumquats on 30 different conditions, from heart disease to arthritis, from cancer to kidney disease, and found only one that was correlated. If you look at enough things, you are almost guaranteed to find a spurious correlation somewhere (the sketch after this list shows the arithmetic).
- And the list goes on… Sometimes the write-up of the study may be flawed. I’ve seen all too many published studies where the data don’t justify the conclusion, or even where they justify the exact opposite conclusion! Editors and peer reviewers are supposed to weed those out, but they sometimes do an incompetent job.
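Two of the items above, false positives and multiple endpoints, come down to simple arithmetic. Here is a minimal sketch (the trial and endpoint counts are just the invented numbers from the kumquat example, not a model of any real study):

```python
import random

random.seed(1)

# False positives: with a p = 0.05 cutoff, a treatment with no effect at all
# still "passes" about 1 time in 20 purely by chance.
null_trials = 20
false_positives = sum(random.random() < 0.05 for _ in range(null_trials))
print(f"false positives in {null_trials} null trials: {false_positives} (about 1 expected)")

# Multiple endpoints: test 30 unrelated conditions at p = 0.05 each and the
# chance of at least one spurious "hit" is 1 - 0.95**30, roughly 79%.
endpoints = 30
p_at_least_one = 1 - 0.95 ** endpoints
print(f"chance of at least one spurious correlation across {endpoints} endpoints: {p_at_least_one:.0%}")
```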
Few journalists, even science journalists, have a deep understanding of science. They may not appreciate the vast gulf between measuring a chemical reaction in a few rat cells in a test tube and running a randomized, placebo-controlled, double-blind clinical trial. They want those bylines and column inches. Sensationalism trumps measured judgment. Attention-grabbing headlines often bear little resemblance to what the study actually showed. Editors are more interested in selling newspapers than in accurately portraying reality.
What to do? Whenever you hear “new study shows…,” take a deep breath and activate your baloney detector. Be patient. Don’t change what you are doing before you see further evidence. File it away in the back of your mind and stay tuned for subsequent developments. You can check PubMed to see if any studies have found different results. You can check to see if anyone has critiqued the research in question. You can check reliable science blogs and other Internet sources for the interpretations of scientists in the field who can put the findings into perspective.
The single most important thing you can do is remember the SkepDoc’s Rule: before you accept any claim, try to find out who disagrees with it and why. There is always disagreement, even about whether vaccines cause autism and whether men landed on the moon. Once you have located the opposing arguments you can evaluate which side has the most credible evidence and the fewest logical fallacies. It’s usually easy to spot the winner.
One of my friends has described it as “the slow lumbering beast we call science.” The behemoth is clumsy; it stumbles and meanders in its quest, but its course is ultimately self-correcting and it inexorably trudges towards its final goal: the truth.
This article was originally published as a SkepDoc column in Skeptic magazine.