Are two years’ data better than one?

Photo: Two giraffes by Vera Kratochvil, released to public domain. Two giraffes are definitely better than one.

Ecologists are perennially angst-ridden about sample size.  A lot of our work is logistically difficult, involves observations on large spatial or temporal scales, or involves rare species or unique geographic features.  And yet we know that replication is important, and we bend over backwards to achieve it.

Sometimes, I think, too far backward, and this can result in wasted effort.

Which brings me to the oddness of the number two.  I’m sure you’ve read this study, if you haven’t written it yourself: some experiment or some observation is conducted, and then repeated in a second year.  Or perhaps the experiment is replicated at two different study sites, or in two plots at a site, or for two different species. (I’ve done this myself; as just one example, this study is replicated at two different sites, although only in a single year.)  And yet: two seems, in general, like a really dumb choice.

If I run an experiment in two different years (I’ll stick to “years”, but the argument is identical for sites, species, etc.), I might be thinking about years in one of two different ways*.  First, I might be aware of some interesting difference between the two years – for instance, one was a drought year and the other a “normal” year.  Second, I might merely be aware that the world is a variable place, and have some expectation that the outcome of the experiment might vary between any two years.  These ideas about year-to-year differences are treated differently, statistically – but in both cases, it seems like two is a dumb number.

In the first case, “year” is what’s known as a fixed effect.  I’m explicitly interested in the contrast between the drought year and the normal year, and I’d like to estimate the effect of drought on whatever we’re measuring.  But of course my two years give me one drought year and one normal year.  With enough replication within years, I can get a really good estimate of the difference between those years – but I can’t ascribe that difference to drought (or anything else) because there’s no replication at the level of drought vs. normal.  So two is really no better than one – despite being twice as much work and expense.
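To make the confounding concrete, here is a minimal simulation sketch in Python (standard library only). Everything in it is an assumption for illustration: the function names, the 50 plots per year, and the among-year and among-plot standard deviations of 1. The simulated data contain no drought effect at all, only ordinary year-to-year variation, and yet in this setup the two years usually differ by more than twice the standard error of their difference.

```python
import random
import statistics


def two_year_difference(n_plots=50, sd_year=1.0, sd_plot=1.0, rng=random):
    """Simulate two years with NO drought effect.

    Each year gets its own random year effect, plus plot-level noise.
    Returns (difference between year means, standard error of that difference).
    """
    means, variances = [], []
    for _ in range(2):
        year_effect = rng.gauss(0.0, sd_year)  # ordinary among-year variation
        plots = [year_effect + rng.gauss(0.0, sd_plot) for _ in range(n_plots)]
        means.append(statistics.mean(plots))
        variances.append(statistics.variance(plots))
    diff = means[0] - means[1]
    se = ((variances[0] + variances[1]) / n_plots) ** 0.5
    return diff, se


def false_alarm_rate(n_sims=2000, seed=1, **kwargs):
    """Fraction of simulated two-year studies where |difference| > 2 SE."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        diff, se = two_year_difference(rng=rng, **kwargs)
        if abs(diff) > 2 * se:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print("years vary, no drought effect:", false_alarm_rate())
    print("years identical, no drought effect:", false_alarm_rate(sd_year=0.0))
```

The first rate should come out far above the second. Within-year replication tells you the two years differ, but with one drought year and one normal year, nothing in the data separates "drought" from "these happened to be two different years".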

In the second case, “year” is what’s known as a random effect.  I’m conceptualizing different years as essentially random draws from a larger universe of years that I might have studied, with the idea that there’s variation among years in the outcome of my experiment.  This time, I want to estimate the among-year variance component (wanting to know, perhaps, whether it’s big enough that no single-year study means much).  But estimating variances is hard, and estimates of variance based on two data points are very close to meaningless.  So again, two is really no better than one.
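How close to meaningless? A rough sketch, again in Python with the standard library only, and again with assumed details (the function names, normally distributed year effects, a true among-year variance of 1). It draws many imaginary studies of 2, 5, or 20 years, estimates the among-year variance in each, and reports the range that holds the middle 90% of those estimates.

```python
import random
import statistics


def variance_estimates(n_years, true_sd=1.0, n_sims=5000, seed=1):
    """Sample variances from n_sims imaginary studies of n_years each."""
    rng = random.Random(seed)
    return [
        statistics.variance([rng.gauss(0.0, true_sd) for _ in range(n_years)])
        for _ in range(n_sims)
    ]


def middle_90_percent(estimates):
    """Return the 5th and 95th percentiles of a list of estimates."""
    cuts = statistics.quantiles(estimates, n=20)  # cut points at 5%, 10%, ..., 95%
    return cuts[0], cuts[-1]


if __name__ == "__main__":
    for n_years in (2, 5, 20):
        low, high = middle_90_percent(variance_estimates(n_years))
        print(f"{n_years:>2} years: middle 90% of variance estimates "
              f"runs from {low:.3f} to {high:.3f} (true value 1)")
```

Under these assumptions, a two-year sample variance is the true variance times a chi-square variable with one degree of freedom, so the middle 90% of estimates should span roughly 0.004 to 3.8 times the truth. That is a range of about three orders of magnitude, and it narrows only slowly as years are added.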

And yet, sample sizes of two (years, plots, sites, species, etc.) are abundant in our literature**.  Why?  Not, I think, because we’re collectively unaware of the futility of two.  I doubt that anything I’ve written so far has surprised you.  My best guess is that despite two’s statistical futility, it nonetheless plays a useful role in an effective publishing strategy.  We all know that reviewers will question “unreplicated” studies, and so we repeat an experiment in two years simply to pre-empt that criticism.  Reviewers and editors tend to be satisfied by two years, not because they really think it improves our inference, but because it (1) brings a study into alignment with common literature practice, and (2) gives them enough cover that they don’t have to think of themselves as approving an “unreplicated” study.  We’re all parties to this unspoken social contract: we agree to pretend that two is enough, and we agree to ignore all the other axes along which the study might be “unreplicated”.  (Did we repeat it with two genotypes?  In two different months in each year?  With two different brands of fertilizer?  Using two different capture or marking techniques?  I could go on; but you’d probably prefer I didn’t.)  We implicitly agree on two because it smooths our publishing interactions***.

Is this cynical?  Yeah.  Am I worried that my next manuscript may be rejected by a reviewer who points out we’ve only done two years, and cites this post?  Absolutely.

© Stephen Heard, July 25, 2017

*If I’m “thinking” about years at all, but I guess part of my point is that I’m probably not, really.

**I was starting to sift through my own papers to see how many times I’ve done this myself.  But before I got far, I realized that I’d rather not know. I know, that makes me a bad person.

***If you watch The Big Bang Theory, you’ll know that Sheldon can object to some action because it’s objectively and logically silly (gift-giving, for instance).  But if he’s told that it’s a “non-optional social convention”, he simply shrugs and climbs happily on board. Two may simply be a non-optional social convention in scientific publishing.


16 thoughts on “Are two years’ data better than one?”

  1. Emilie Champagne (@MissEmilieC)

    I completely agree with you Steve, two is generally not better than one. But I think you might have overlooked one reason for two-year studies that is less cynical: it’s the number of field seasons a master’s student will usually do. At least in my experience, I know quite a lot of master’s students in my lab who did and published studies with 2 years of data. I certainly am one. Even in my PhD, 2 years was all that I had to get data.

    1. Peter Apps

      Not really – you can get a really good straight line between two points, or if you force it through the origin you only need one point!


  2. Ambika Kamath

    My first independent field project, I had two treatments, two sexes, and two sites per treatment (thankfully just one year though), and come analysis-time, I was bashing my head against the wall. Since then, I’ve really tried to avoid two, but I bet it’ll happen again 🙂

      1. Ambika Kamath

        For sure! It was indeed mostly two sites that led to the head bashing, but since it was a question of dividing a finite amount of effort, I regretted not sticking to either two treatments or two sexes…


  3. Manu Saunders

    Great post, completely agree. I’ve mostly worked with fruit orchard systems – some fruit varieties have biennial bearing, which creates an issue similar to your drought/normal years. Ideally, you’d sample for 3+ years, but with the short contract timeframes of PhD/postdoc projects that’s impossible. Luckily for most of the ecological relationships I focus on, better spatial replication within 1 year can be just as informative. But one day, when I have a permanent job…..


  4. Pingback: Friday links: the Great Emu War, world’s oldest bar graph, 10 principles of ecology, and more | Dynamic Ecology

  5. roots & rhizomes

    I agree that you can test neither a fixed nor a random effect of year (or site) in the case you are describing, but you definitely can see whether your experimental setup gives the same results in the two years (or sites). If you use only one year (or site), you are tempted to over-interpret your results. We used two meadows and two years for one experiment, and despite problems with referees (why not more than two?) I was happy with this decision: plant response to treatments differed between the two meadows and between the two years! Although our results were not easily publishable, we learned a lot about the system.


    1. ScientistSeesSquirrel Post author

      Yes, I agree that you can ask whether things are the same at your two sites/years. But that’s about it. You can’t ask why (i.e. associate this with any characteristics of the sites/years); you can’t ask whether that variation is typical among years/sites. (Roughly, those two questions map to fixed/random treatment of sites/years). So – is what you learn from the second site/year worth the effort? I still find it hard to see why! And that’s coming from someone who spent the whole weekend working on a 2-year dataset 🙂


      1. roots & rhizomes

        Frankly, properly testing why the years or sites differ would take 5x or 10x more effort, and this is almost never possible. Knowing that your precious results are year- or site-specific is therefore very valuable…. I consider false discovery more dangerous for our understanding of an ecological phenomenon than acknowledging the complexity of the system in question.

  6. Pavel Dodonov

    …Yes! I’ve always thought about this, and now I have something to tell people to read, so thanks!
    Another reason for the number two is that sometimes the funding agency reviewers request additional sampling, even though it is pointless. I know people who had to double their effort, or perhaps include the dry season (when species are less active and less detectable, so what’s the point?) to satisfy the funding agency.
    I once performed a study* with two campaigns. It was a nest predation experiment; each campaign lasted two weeks, with about one month between them. I think the reasons for the two campaigns were 1) to include both the period when the birds are arriving and the period when they have already arrived, and 2) to safeguard against some demonic intrusion that could end the experiment – we had a limited number of nests to place, and if there was, e.g., a fire during the first campaign, we would still be able to obtain some data. Luckily, campaign was not important in the final analysis.
    As another example, I have a friend comparing two fishermen communities, one located close to a protected area and another located far from it. It’s not true replication, but, considering the difficulties involved in social research of this kind, I think it sort of makes sense.
    But in general, I agree: wasted effort which does not really give better results.
    * Here’s the study, if someone’s curious: I’m sort of proud of it.


  7. Pingback: Quando dois é pior do que um, ou: Hurlbert avisou! – Mais Um Blog de Ecologia e Estatística

  8. Pingback: Negative-news bias and “the disaster that is peer review” | Scientist Sees Squirrel
