Negative results, conferences, and the open file drawer

I had more than my usual dose of conferences last summer (as you might have noticed).  After four major conferences in three months, something finally sank into my tired, tired brain: conferences tell a very different story than journals.  In particular, conference talks are loaded with negative results – far more so than our journals*.  So is this a problem?  An opportunity?  Both?

My first thought was that I was getting a glimpse of the open file drawer.  Everyone is familiar with the “file drawer problem” – it’s easier, and more fun, to publish a compelling positive result than a negative one.  So all those experiments yielding nonsignificant ANOVAs, all those observations with shotgun-blast non-correlations, end up in the file drawer, and our understanding of the world is slanted by the resulting filter.  Aha, I thought – comparing conference talks with published papers could measure the file drawer; and publishing conference-presented results could prop the file drawer open so the light can shine in there.  Even better, people who register for conferences before completing data analysis could be thought of as having preregistered their studies.  Brilliant!
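
(For the quantitatively inclined, here’s a minimal sketch of what that naive comparison might look like – in Python, with entirely invented counts, since the whole point of the exercise would be to collect the real ones:)

    from scipy.stats import fisher_exact

    # Hypothetical tallies of negative vs. positive results -- invented numbers,
    # purely to show the shape of the comparison, not data from any real survey.
    conference = {"negative": 40, "positive": 60}   # talks at a conference
    journals = {"negative": 10, "positive": 90}     # papers in the literature

    # Fisher's exact test on the 2x2 table: are negative results
    # over-represented at conferences relative to journals?
    table = [
        [conference["negative"], conference["positive"]],
        [journals["negative"], journals["positive"]],
    ]
    odds_ratio, p_value = fisher_exact(table)
    print(f"odds ratio = {odds_ratio:.1f}, P = {p_value:.4g}")

An odds ratio well above 1 would suggest that negative results are over-represented at conferences relative to journals – one crude index of the file drawer’s size.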

Well, not so fast.  I thought a little more, and of course it isn’t that easy.

First, the negative-result mismatch may not be a problem.  There are several reasons for the pattern I’ve noticed, and they don’t all involve the file drawer.  Imagine that I present some negative results at a conference, but don’t publish them – or I later publish an analysis with a positive result instead.  If this is because I gave that easy negative-results talk but then dropped the data and forgot about it, then yes, that’s the file drawer.  (Or if I kept trying new analyses until finally I got P<0.05, that’s P-hacking.)  But conferences are good places for pilot data and proof-of-concept talks, too.  Sometimes I might present negative results because only some of my data are in; later, with a larger sample size and the power that results, a real pattern may emerge.   Sometimes I might present negative results because I’ve only done a quick-and-dirty analysis; with more time, I can use more sophisticated statistics to pick apart complexity and reveal the real pattern.  In either case the negative result is best thought of as incomplete or provisional, a placeholder rather than the definitive answer that must either be published or consigned to the file drawer**.
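
(A quick illustration of the sample-size point, again in Python with made-up numbers: a power sketch, hypothetical rather than drawn from any real study, showing how the same modest effect goes from nearly undetectable to reliably detectable as the dataset grows.)

    from statsmodels.stats.power import TTestIndPower

    # A modest true effect (Cohen's d = 0.4), tested at alpha = 0.05.
    # Illustrative values only -- not from any real study.
    analysis = TTestIndPower()
    for n in (15, 50, 100):
        power = analysis.power(effect_size=0.4, nobs1=n, alpha=0.05)
        print(f"n = {n:3d} per group: power = {power:.2f}")
    # expected output, approximately:
    # n =  15 per group: power = 0.18
    # n =  50 per group: power = 0.51
    # n = 100 per group: power = 0.80

With 15 observations per group, the pilot analysis will usually return a nonsignificant result even though the effect is real; by 100 per group, it will usually be detected.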

Second, using conferences to measure the file drawer may not be an opportunity (or at least, not an easy one).  Partly that’s a consequence of my argument in the last paragraph: if many negative-result talks aren’t really examples of the file drawer, drawing useful comparisons of negative-result frequencies between conferences and journals will be very difficult.  But it’s also because there’s likely to be an observer effect: drawing attention to those negative results would probably change our willingness to present them.  Doing much with conference negatives would, I think, require that they be published somehow.  In some fields (such as engineering and computer science), conference papers are routinely published in proceedings (and I would speculate that tentative and negative results are less commonly represented there).  In my own field, some conferences put abstracts online with the programme; other conferences list only titles.  What would happen if we required abstracts with results, and made them searchable?  Again, speculation; but I’d be surprised if we didn’t see a change in the kind of presentations on offer.  Those with negative results might be less likely to submit them; and those with no results at all (yet) would not be able to submit them.  Either shift would compromise the measurement of the file drawer problem – the very measurement that this mandatory deposit of abstracts was supposed to enable in the first place***.

So: darn it.  I thought I had a brilliant idea, but like a lot of my supposedly brilliant ideas, it didn’t hold up well to scrutiny.  Of course, it wouldn’t be impossible to study the frequency of negative results at conferences and in the literature; but it would be very difficult.  It would require in-person attendance rather than work with online abstracts; it would need lots of careful judgment calls (to separate pilot studies and quick-and-dirty first cuts from true negatives); and there are probably more complications I haven’t thought of.  Ph.D. thesis in science studies, anyone?

So at next year’s crop of conferences, how should I think about those negative-results talks?  Well, some of them will be true negatives.  But many will be just works in progress, and their presentation represents an opportunity for my colleagues and me to put our heads together – in search of refinements to methods, more powerful analyses, and all the rest.  Nature is complicated.  Our first answer isn’t always (or even usually) the final word; and we’re cleverer together than we are individually.  This makes science fun, and it should make negative-results talks at conferences fun too.

© Stephen Heard (sheard@unb.ca) December 28, 2016


*I don’t know why I’d never paid attention to this before.  It’s hardly a subtle pattern, and I’m sure all kinds of people have been smarter than me and remarked on it before.  But then, it does say “seldom original” right there in the Scientist Sees Squirrel masthead.

**Actually, I think analyses for conferences are almost always provisional, for three reasons.  First, we’re all excited to talk about our freshest data.  Second, we all overestimate how quickly we can process and analyze our data.  Third, and most importantly, it’s rare for the first version of an analysis to be the definitive one.  The very process of presenting the data in a talk, or writing it up for submission, is likely to spark realizations about better ways to work with the data; or audience members or peer reviewers may suggest more powerful approaches.  This is completely normal.

***Of course, any individual presenter can upload their abstract, full poster, or talk slides to an archive such as FigShare or F1000Research (here’s the Ecological Society of America’s “channel” at F1000Research, as an example).  But I strongly suspect that this voluntary “publication” provokes an even stronger shift in the frequency of negative and incomplete results than mandatory “publication” would.

13 thoughts on “Negative results, conferences, and the open file drawer”

  1. David Mellor

    I wonder how many unofficial registries are out there. One that I know of is the Time-Sharing Experiments for the Social Sciences, http://www.tessexperiments.org/, where individual questionnaires are submitted to a common group that runs the study with a large research participant pool. The questionnaires eventually become public, so work like this becomes possible: http://pan.oxfordjournals.org/content/23/2/306, a study that compared the conducted research to the published findings and found (wait for it…) that the published results were about twice as likely to be “significant” and showed about twice the effect size of the unpublished findings. Obviously not surprising, but worth noting as one of the few examples of having an accurate view into the “file drawer” and measuring its “size”. Getting our view of reality only from the published literature leaves us with a biased and unrepresentative view of the universe.

    Time for a plug! In the next few months, any association (from a lab all the way up to an academic society) will be able to quickly create its own registry with its own guidelines and incentives for use (obviously it is only valuable if there is some sort of incentive to use it, like access to participants or data, or recognition that registered work is more rigorous than unregistered work). https://osf.io/registries/

    1. ScientistSeesSquirrel Post author

      This is great – thanks, David. Although to be clear, the paper you mention doesn’t have anything to do with conferences, correct? It’s another, and likely much better, way to get at the file drawer.

      Having said that – I only read the Abstract, but it seems odd that “final study reported fewer outcome variables than listed in the survey” is reported as being a Bad Thing, and indicative of questionable research practices. To me, that seems like good writing: you want to focus on your story, and not harass your readers with a lot of distracting information. (Obviously this is not black-and-white; you can’t measure 100 things and report the 5 that were significant; but you also shouldn’t report 100 things that turned out irrelevant, only because you measured them.)

      Thanks for the links!

      1. David Mellor

        Correct, nothing to do with conferences, unfortunately.

        The balance between creating a focused story and representing the whole reality is hard to strike, but since it must be struck, it is better to strike it before knowing the results. Obviously the outcome variables were important enough to include in the original questionnaires, but not important enough after seeing that nothing unexpected happened with them. Making something publishable requires a clean narrative and surprising results, even when a more accurate picture of reality would require a messy narrative with lots of routine findings.

        Slides 14-16 in this presentation, https://osf.io/u74qf/, contain some key figures from the work, and compare the number of conditions and variables in the “registered” works with those that were eventually reported.

        1. ScientistSeesSquirrel Post author

          Yes, a “balance” is exactly right. And it has to be done. Yes, better to do it before knowing the results; but unless your design is very simple (and the world is very simple!) that judgment may not be possible a priori – or put another way, it may really limit the investigator if they are not “allowed” to do any analysis they haven’t thought of beforehand. So preregistration has obvious plusses, but the lack of universal uptake is neither surprising nor unreasonable, I think. But – to be clear, it’s also a really new idea for ecologists, so I haven’t thought extremely hard about it!

          1. David Mellor

            Would you be willing to try it out for your next data collection effort? We use $1000 prizes (https://cos.io/prereg) as incentives to try it once. A real benefit is not to prevent someone from trying out unanticipated analyses, but rather to make more explicit the fact that those analyses (e.g. unexpected moderators) are post-hoc explanations.

            1. ScientistSeesSquirrel Post author

              That’s a good explanation. Would I try it? Maybe. Margaret Kosmala’s post (http://ecologybits.com/index.php/2016/11/16/thoughts-on-preregistering-my-research/) made clear there are really significant costs to doing this, but also some advantages. At least, the costs are really significant when the data are complex – which is often the case for ecologists. Perhaps the best thing I can think of is to use the threat of “preregistration will be difficult” to encourage myself to design experiments/surveys such that the stats will be quick, simple, easy, and unambiguous. Those are the kind of studies that preregistration seems to be made for – although ironically, they may be the very ones for which it is less useful!

  2. Jeff Houlahan

    Steve, why would you think that provisional results would lead to a bias towards negative results at conferences? Wouldn’t it be just as likely that you would present a positive result at a conference and upon more careful consideration conclude the result was negative?

    1. ScientistSeesSquirrel Post author

      That’s certainly possible – say, when you later remove a confound. But I was thinking that for a conference presentation (1) you’d have a smaller dataset, and thus less power; and (2) you’d have done a quick-and-dirty analysis, and thus less ability to tease out a real relationship using covariates and the like. Both of those make a negative result first, and a positive one later, the more likely sequence. But you’re right, it’s not the only possibility!

  3. Jeremy Fox

    It’s not just a file drawer issue. Some conference presentations might end up getting published with different hypotheses than they originally had, as authors retrofit their hypotheses to better match their data. No idea how often this happens in ecology and evolution, but it’s common in some fields. We had an old link on this at Dynamic Ecology (can’t find it now, sorry). Somebody went back to a bunch of business dissertations and compared them to the subsequent papers. A substantial fraction of the papers dropped data from the dissertation, altered and even reversed the dissertation hypotheses, etc. Always so as to improve the match between the hypotheses and the data.

    1. ScientistSeesSquirrel Post author

      I remember that study. And of course that’s somewhat of a grey area too – we drop data all the time to improve our storytelling, but it shouldn’t be to _change_ our storytelling. Ditto, new hypotheses arise during work with the data – that’s normal. But “always to improve the match”, now that’s a problem!

  4. Ken Thompson

    “What would happen if we required abstracts with results, and made them searchable?”

    I suspect this would discourage undergraduate participation in most conferences. The typical undergraduate thesis is completed after most conference abstracts are due (at least by the early-bird deadline). If my own experience can be drawn on for generalities, my thesis was nowhere near complete when I registered for CSEE 2014 (although I can’t remember if I had to submit an abstract, and a quick search doesn’t reveal one). If I had been required to submit results in order to register, I wouldn’t have been able to present for at least another year! That conference was my first and was very formative for me, and I am very glad that I was able to register for a talk despite not knowing what the results of my studies would be. As it turns out, the results were negative. That didn’t stop me from giving a talk, and it didn’t stop the paper from being published in a good journal (J Ecol).

