I wrote recently about the reproducibility “crisis” and its connection to the history of our Methods, and some discussion of that post prompted me to think about another angle on reproducibility. That angle: is our literature a big pile of facts? And why might we think so, and why does it matter?
John Ioannidis (2005) famously claimed that “Most Published Research Findings Are False”. This paper has been cited 2600 times and is still frequently quoted, tweeted, and blogged.* Ioannidis’ paper made important points about the way natural variation, statistical inference, and publication culture interact to mean that we can’t assume every statistically significant result indicates a truth about nature. But I think it’s also encouraged us as scientists to over-react, and to mislead ourselves into a fundamental misunderstanding of our scientific process. To see what I mean, start with Ioannidis’ very interesting opening sentence:
“Published research findings are sometimes refuted by subsequent evidence, with ensuing confusion and disappointment.”
I say this is an “interesting” sentence because I think it raises an important question about us and our understanding of the scientific process. That question: why should we experience “confusion and disappointment” when a published study isn’t backed up by further evidence? Surely this is an odd reaction, and one that only makes sense if we think that everything in a journal is a Fact, and that our literature is a big pile of such Facts – of things we know to be true, and things we know to be false.
This is indeed the way science often gets taught in elementary and high school, but surely as scientists we ought to know better. After all, we explain this all the time to our non-science friends, decrying the science-as-a-big-pile-of-facts approach to science teaching. Science is a process, not an assemblage of facts, and we make progress by taking small steps and updating our understanding through synthesis of many different results. There are things we are very sure of (gravity points down, species evolve by natural selection), things we’re pretty sure of (secondhand smoke exposure increases cancer risk), and things we’re identifying as new and interesting findings that we’re not yet sure of (insect herbivores have more impact on evolutionarily novel hosts). Surely nobody thinks that everything that we publish as having P<0.05 is going to hold up as a true statement about the universe? Instead, we ought to see such studies as interesting observations that will be tested by further studies**.
Note that when I say “tested by further studies” I’m not suggesting that every study will be, or should be, replicated. While we tell each other that this is how science works, it largely isn’t. Instead, results gain authority by their consilience with other related studies – by the accumulated weight of findings, by our ability to build further understanding on top of them, or by the way they make sense of many disparate observations. We may get to consilience via formal meta-analysis, less formal synthesis, or just by seeing that studies building on an earlier report work out the way we’d expect if that earlier report were true. This system might seem kludgy, but it’s been working very, very well for us for hundreds of years.
Now, this doesn’t mean I think our publishing culture is perfect. One of the big issues Ioannidis’ claim drew our attention to is the problem of publication bias: the “filtering” of results for publication by statistical significance. The task of verifying and consolidating knowledge is greatly complicated by the veil drawn over “negative” results. While we have lots of techniques that help (funnel plots in meta-analysis, for example), it would surely be better if we could come up with ways of publishing, or at least registering, nonsignificant results so they can give us perspective on the significant ones! It’s been encouraging to see some recent progress here – for instance, the existence of the Journal of Negative Results in Biomedicine.
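For readers who like to see the "filtering" effect in numbers, here is a minimal simulation sketch. Everything in it is an assumption made for illustration, not something taken from Ioannidis' paper: the true effect size (0.2 standard deviations), the sample size (20 per group), the number of studies, and the use of a simple normal-approximation test (|z| > 1.96) in place of a proper t-test. The function name `simulate_studies` is hypothetical. The idea is only to compare the average effect across all simulated studies with the average across the "significant" ones that would pass the filter.

```python
import random
import statistics


def simulate_studies(n_studies=2000, n_per_group=20, true_effect=0.2, seed=1):
    """Simulate many two-group studies of one modest true effect (in SD units).

    Returns a list of (estimated_effect, is_significant) pairs.
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        diff = statistics.mean(treated) - statistics.mean(control)
        # Standard error of the difference between two group means
        se = ((statistics.variance(control) + statistics.variance(treated))
              / n_per_group) ** 0.5
        # Normal approximation to a two-sided test at the 0.05 level
        significant = abs(diff / se) > 1.96
        results.append((diff, significant))
    return results


results = simulate_studies()
all_effects = [d for d, _ in results]
published = [d for d, sig in results if sig]  # the "filtered" literature

mean_all = statistics.mean(all_effects)
mean_published = statistics.mean(published)

print(f"studies run: {len(results)}, 'significant': {len(published)}")
print(f"mean effect, all studies:        {mean_all:.2f}")
print(f"mean effect, 'significant' only: {mean_published:.2f}")
```

Under these assumptions, only a minority of the simulated studies clear the significance bar, and those that do tend to overestimate the true effect. The full set, nonsignificant results included, averages out close to the value that was built in, which is the perspective the nonsignificant results would give us if we could see them.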
I think I’ve made three major points here. First, we shouldn’t be scandalized that any individual published paper may be “wrong”. Of course it may; it isn’t its job to be individually, infallibly right! Second, the way we’ve been doing science (“wrong” published findings and all) works, overall, rather well: our understanding of the natural world is impressive and getting more so all the time. And finally (of course) the way we do science isn’t perfect, and we can and should improve our systems where we can. We just ought to take a deep breath first, think carefully about how discovery works, and try not to let the idea that Our Literature Is A Big Pile Of Facts trap us into too much hand-wringing. It isn’t, and it shouldn’t be.
© Stephen Heard (firstname.lastname@example.org) April 7, 2015 Image: A big pile of facts? Photo S. Heard.
*Interestingly, two followup studies, each showing that the “problem” is less severe than Ioannidis claimed, have been cited just 120 (Moonesinghe et al. 2007) and 37 (Goodman and Greenland 2007) times: 5% and 1% of the citations for the original. This probably says something about the degree to which we get much more excited about apparently bad news than we do about apparently good news.
**It’s possible that mathematics, or at least pure mathematics, operates very differently: if each paper is a proof, and if proofs seldom contain errors, then that literature is arguably a big pile of Facts. I hope a pure mathematician will read this and enlighten me via the Comments.