This post was sparked by an interesting e-mail exchange with Jeremy Fox, over at Dynamic Ecology. We’d both come across the same announcement of a (very likely) case of research fraud, and had some similar reactions to it. We both knew there was a blog post in it! We agreed to post at the same time, but not to share draft posts. My prediction: we agree on some parts, not on others; but Jeremy’s post is better.
Behavioural economics got a bit of a black eye last week with the revelation that a major study by some very prominent authors is, virtually certainly, based on fraudulent data. What’s really astonishing, if you read that post (and you should), is that the fraud was so stunningly obvious with even a rather shallow dive into the data. Just to pick one thing, a treatment effect in the paper seems to have been generated by taking one variable and adding to it a random number drawn from a uniform distribution bounded by 0 and 50,000. (Seriously, read the post.) This is such an implausible distribution for a real experimental effect that, once it’s been noticed, it’s about the most flagrant red flag you could imagine.
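If you’re curious just how obvious that signature is, here’s a minimal sketch in Python. Every number below is hypothetical (nothing comes from the actual paper’s data); it just simulates the alleged strategy of bolting a Uniform(0, 50,000) random number onto a variable and then shows the two giveaways an auditor would see:

```python
import random
from collections import Counter

# Entirely hypothetical simulation of the alleged fabrication strategy:
# take a baseline variable and add Uniform(0, 50_000) noise as a fake
# "treatment effect". None of these numbers come from the real dataset.
random.seed(42)
n = 10_000

baseline = [random.gauss(50_000, 15_000) for _ in range(n)]
fake_effect = [random.uniform(0, 50_000) for _ in range(n)]
treatment = [b + e for b, e in zip(baseline, fake_effect)]

# An auditor comparing treatment to baseline sees the differences:
diff = [t - b for t, b in zip(treatment, baseline)]

# Red flag 1: a hard ceiling. No "effect" ever exceeds the bound.
print(max(diff) < 50_000)  # True

# Red flag 2: a flat histogram. Ten equal-width bins each hold roughly
# 10% of the data, instead of the bell-shaped pile-up a real, noisy
# effect would usually show.
bins = Counter(min(int(d // 5_000), 9) for d in diff)
flatness = min(bins.values()) / max(bins.values())
print(flatness)  # near 1 for flat fabricated data; far lower for a bell shape
```

A genuine experimental effect would rarely combine a hard upper cutoff with a dead-flat histogram, which is why the pattern, once noticed, is so damning.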
It’s not just this paper, though. Think about recent fraud cases in ecology and evolution, at least some of which were detected because investigators apparently copied-and-pasted large blocks of numbers in spreadsheets – spreadsheets that were posted publicly under a journal’s data archiving policy.
But why? Not why fraud (that’s an interesting question that Jeremy has dived into before, for instance here, but it’s not my question today). Instead, why clumsy, obvious fraud? There’s an interesting contrast here, I think, with forgeries in art. The “add a uniform random number from 0 to 50,000” strategy seems like taking a can of house paint and a ruler and dashing off a ten-minute “Mondrian”. Art curators, collectors, or auction houses wouldn’t be fooled for a moment. Instead, modern art forgery is an extremely sophisticated business – forgers find authentic period furniture, for example, that can be disassembled to yield wood panels of the right age to paint on. Why the contrast? Why do scientists seem so bad at fraud?*
Well, I don’t have an answer for you. (Perhaps fortunately, I’m not a scientific fraud expert.) But here are a few (interrelated) ideas, and you can suggest others in the Replies.
- Science is fundamentally built on trust. As modern science became professionalized, in the late 19th and early 20th centuries, we changed our minds about how a piece of scientific writing gained authority. In science’s early days, investigators were concerned with having witnesses to their work who could vouch for it. With professionalization, we began to assign authority to research results in part because their author was a member of the profession, with credentials from, and appointment at, reputable institutions. That doesn’t work completely, of course, but the fact that fraud rates (or at least, detected fraud rates) are very low suggests it works to a considerable extent. And since the system operates by trust, it’s vulnerable to even crude attempts at deceit. (Jeremy has explored that idea here.) You can think of this as good news: fraud isn’t testing the system enough to require attempts at fraud to be sophisticated. In art, this isn’t true; art forgery has been a lucrative enterprise for long enough that the fraud-detection system has had to be well practiced.
- Building on that last point: we’re mostly, probably, amateurs at fraud. With perhaps a very few exceptions, there aren’t professional science fraudsters (serial fraud appears even rarer than fraud itself). Art forgers have expertise and practice, and sometimes long careers producing dozens or hundreds of forgeries. Perhaps most scientific fraud is committed by someone who’s just decided to give it a whirl, under time pressure as the job market, a tenure deadline, or some other motivator looms. If fraud happens that way, perhaps it’s no surprise it’s slapdash. But to undercut my own suggestion: are we really that amateurish? We’re pretty good at experimental design, data analysis, and all the rest. Do our skills really not extrapolate to fraud?
- What if our skills do in principle extrapolate to fraud, but in practice, we don’t apply those skills? Here’s the thing: it’s likely that sophisticated fraud is difficult. Sure, it would have been trivially easy, for that economics paper, to draw “treatment effect” values from a normal distribution instead of a uniform one; but that’s only one of the many red flags in that paper. Nature does really weird and complicated things to our real data. Simulating that convincingly has always struck me as very difficult. I mean, I don’t actually need reasons not to fake my data; but if I did, my complete confidence that I’d suck at it would provide one! To take this one step further: it seems likely to me that conducting a really convincing, sophisticated scientific fraud would be more work than just doing the experiment for real.
- Finally, the most troubling possibility (and it’s so obvious that I’m sure you’ve gotten there before me): maybe scientists do conduct sophisticated fraud, but it goes undetected. Under this explanation, the fraud cases in the news are the crude ones simply because those are the ones we notice easily; but if we put the effort in to thoroughly review our literature, we’d unveil a wide distribution of fraud “quality”. Imagine a normal (not uniform, for Pete’s sake!) distribution of fraud quality, with the veil line due to limited inspection currently hiding all but the rare crude cases in the left (low-quality) tail. If we poured in more resources, we’d see more and more sophisticated cases, and likely more of them. (Because it would be, frankly, embarrassing if the crudeness of the uniform-0-to-50,000 case were the mode!) This idea will appeal to those who think that everything in science is “broken”, and I have to admit it isn’t impossible – but it’s very difficult to test. In art, substantial resources go into the investigation of forgery, because the costs of forgery are very high: a work could sell for $100,000,000 if judged genuine, but be near-worthless if that judgment is wrong. Do we have these same incentives in science? Arguably not – I’d take the perhaps controversial position that it’s unusual for a fraudulent paper to have very high costs. That’s because we rarely base our understanding of nature on a single paper – especially when we intend to build major investments (science, policy, or otherwise) on that understanding. I think it’s just hard for a single paper (fraudulent or not) to move the needle. So it’s rational for us not to invest more in fraud detection – even if it’s disturbing to think about what we might find.
So, four ideas (and please offer others in the Replies). They each seem plausible to me – but none stop me from being utterly gobsmacked each time I see another example of fraud that’s so crude it’s almost clownish. It would be amusing, if it wasn’t – no, wait, I have to admit: it IS amusing.
© Stephen Heard August 25, 2021
Image: How an ecologist might forge art. Public domain, via Maxpixel.net
*Jeremy alerted me to this Retraction Watch post (from yesterday), which describes a faker’s “elaborate steps to cover her deception”. However, the faker in question has had multiple papers retracted, grant funding yanked, and her medical license (temporarily) revoked, all without the apparent need for Sherlock Holmesian deduction. So I’m not sure she’s really that good an example of artful scientific fraud.