Turning our scientific lens on our scientific enterprise: a randomized experiment on double-blinding at Functional Ecology

Image: Experiment, © Nick Youngson via picpedia.org, CC BY-SA 3.0

I’m often puzzled by the reluctance of scientists to think scientifically and do science.  “Wait”, you say, “that’s a bizarre claim – we do science all the time, that’s why we’re called scientists”.  Well, yes, and no.

We love doing science on nature – the observations and experiments and theoretical work we deploy in discovering how the universe works.  What we don’t seem to love nearly as much is doing science on ourselves.  By “ourselves”, I really mean the scientific enterprise: the way we deploy those research tools, the ways we organize and operate our teams, and – hewing most closely to my own interests – the ways we write up and publish our results.*

Let’s focus just on that last one: writing and publishing.  There are so many questions that get hotly debated (some, incandescently debated).  Are serif fonts or sans-serif fonts more readable?  What about text with one space or two after a period?  Will members of the general public actually read our papers, if we publish open access so that they can?  Should papers, and paper titles, include or avoid humour?  These are all issues for which the frequent expositions of strong opinions virtually never draw on data.  In fact, they’re all issues for which data are scant, if they exist at all – a situation that extends to much of our understanding of scientific writing and publishing.

Now you have the context for why I’m so pleased to see that this week, the journal Functional Ecology is launching a randomized experiment on “double-blind” review (double-anonymous is a better term, but a less familiar one).  In double-blind review, the authors’ identities are hidden from reviewers** and (as is more usual) reviewers’ identities hidden from the authors.  The intent of making authors anonymous is to reduce reviewer biases, both conscious and unconscious.  This is most frequently discussed with respect to author gender, but there are plenty of other possibilities: biases based on career stage, institutional affiliation, nationality, reputation (fame***), and so on.

Double-blind review isn’t particularly new: among eco-evo journals, for example, The American Naturalist went to double-blind back in 2015, and it wasn’t the first.  There have even been some studies seeking to measure the effect of double-blind review on gender gaps in acceptance rates (and related variables).  The thing is, most of these have been small, or have had design compromises (such as before-after comparisons that confound double-blinding with time).  That’s what’s exciting about the Functional Ecology experiment (here’s their press release, and here’s an editorial from earlier this year outlining their rationale for the study).  It’s a 2-year experiment, which they forecast will involve about 2,500 manuscripts – a sample size sufficient not just to measure gender-specific effects on acceptance rates but also to dissect patterns much more finely, testing for mitigation of many other biases too.  It will have true randomization of submissions to double-blind vs. single-blind review, no opt-out (well, the opt-out is simply to submit somewhere else), and minimal burden for authors (they’ll need to upload title pages and acknowledgements separately, but otherwise, the journal is doing all the work necessary to execute the experiment).  At least for eco-evo, this should tell us what we need to know: is double-blind review sufficient to remove reviewer bias, an important arrow among others in our quiver, or merely a distraction?
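
To get a feel for why ~2,500 manuscripts is an exciting number, here’s a rough back-of-the-envelope power calculation.  To be clear: this is my own sketch, not the journal’s analysis plan, and the acceptance rates and the 5-percentage-point effect in it are invented purely for illustration.

```python
# Rough power sketch for a two-arm review experiment (illustrative only;
# these are NOT Functional Ecology's actual design parameters).
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

n_total = 2500                 # forecast submissions over the 2-year experiment
n_per_arm = n_total // 2       # ~1,250 manuscripts in each review arm

# Assume a 25% baseline acceptance rate, and ask: could we detect a
# 5-percentage-point difference between two equal-sized groups?
p_baseline, p_alt = 0.25, 0.30
h = proportion_effectsize(p_alt, p_baseline)   # Cohen's h for two proportions

power = NormalIndPower().solve_power(
    effect_size=h,
    nobs1=n_per_arm,
    alpha=0.05,
    ratio=1.0,                 # equal allocation to the two arms
    alternative="two-sided",
)
print(f"Cohen's h = {h:.3f}, power ≈ {power:.2f}")  # ~0.80 with these numbers
```

With these made-up numbers, a simple two-group comparison lands right at the conventional 80% power – which is why the finer dissections (say, testing whether blinding shrinks a gender gap, an interaction effect computed on thinner slices of the sample) are only plausible at a sample size this large.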

Now we come to the part where I admit that I’m no better than anyone else. I have an opinion on double-blind review. I’m a fan, having praised (both privately and publicly) efforts to introduce it at journals I’m involved with. But that’s precisely the problem – what am I doing praising it, when that praise is based on plausibility rather than evidence?  At best, the evidence that double-blind review helps is weak (quite possibly not because it doesn’t help but because attempts to test the hypothesis have been weak).  And yet I’d have felt terrible opposing such a sensible approach to reducing bias.  At the risk of sounding self-serving, maybe the right stance is this: in favour of doing double-blind review now, and also in favour of strong attempts to test its value.

So: kudos to Functional Ecology; it looks to me like they’re doing this right.  In a couple of years we’ll have excellent data.  In the meanwhile, let’s keep on exploring ways to reduce biases of all sorts, in all aspects of science.

© Stephen Heard  September 3, 2019

By the way, if you’re curious, as I was: the journal applied for human-subjects ethics approval, and the project was deemed exempt.  Participants (submitting authors) will be informed of the study and how it works during the manuscript-submission process.


*Roughly, these sorts of questions are what make up the discipline of “science studies”.  Many scientists have never heard of this discipline; and many science-studies researchers don’t apply scientific tools to it.  (In both cases, it isn’t none; but it is many and possibly most.)

**It never takes long for someone to insist that there’s no point scrubbing authors’ names from papers because reviewers will easily guess them.  Guessing is in fact easy if the manuscript is posted as a preprint, and the reviewer thinks it’s appropriate to search for that.  Otherwise, in most cases it really isn’t as easy as people think.  Similarly, people often think they can guess the identity of anonymous reviewers; editors have data on this, and believe me, such guesses are usually wrong.

***Imma let you finish reading this post, but first: “fame” for scientists is a relative thing, of course; even the most famous among us is no Taylor Swift.


1 thought on “Turning our scientific lens on our scientific enterprise: a randomized experiment on double-blinding at Functional Ecology”

  1. Tom Saunders

    Good to see a relatively large, long-term experiment of this nature. I find this area complicated because anonymity and transparency in peer review are both good and bad for different reasons, and may benefit and disadvantage different groups of people.

    Double-blinding should reduce the kinds of known biases that disadvantage researchers who are not older, white, male, native English speakers, and affiliated with prestigious institutions (e.g. https://doi.org/10.1038/387341a0, https://doi.org/10.1002/asi.22784).

    But on the other hand, anonymous reviewers have less incentive to provide constructive feedback. For example, in one study signed reviews were less aggressive in tone, and took longer to produce because they were of higher quality than unsigned ones (https://doi.org/10.1192/bjp.176.1.47).

    There are some interesting suggestions of ways to improve or innovate in this space, and I’d love to see more work on these (e.g. some suggestions are offered in https://f1000research.com/articles/6-1151/v3).

