A new preprint, author contributions, and the best kind of collaboration

We’ve just posted a new preprint! Like our recent funny-titles study, it’s a pandemic pivot project. Like our funny-titles study, it’s a little weird – but also exciting. I’ll tell you a bit about the preprint, and then use it to make a point about collaborations.

Have you ever wondered if names only label things, or if they also influence the way we think about those things? Actually, I hope you’re well past wondering, because there’s extensive evidence that it’s the latter: for example, names are one source of unconscious bias in how we think about each other.  But what about the Latin (scientific) names of species? Could the scientific name of a species influence the kind of research that gets done on it?

Told you it was a weird one. Surely not, right? The job of a scientific name is to label a species so we can refer to it unambiguously.* Scientists are too objective and too smart to let that name then determine whether or not they study that species, right?

Apparently not.

Here’s what we did. We focused on plant-feeding arthropods (insects and mites) and on studies of host-associated differentiation (HAD). HAD** is an evolutionary process of dietary specialization, by which a once-generalist species diverges into two (or more) specialist races or species. For example, an insect that once fed on two goldenrod species might give rise, via HAD, to a pair of closely related but genetically distinct races, one feeding on each of the two goldenrods. Evolutionary ecologists are very interested in this process, because plant-feeding arthropods are astonishingly diverse and often tightly specialized, and HAD may be a big part of the group’s diversification.  But there are hundreds of thousands of plant-feeding species that we might study, to ask whether or not they’ve undergone HAD.*** As a field, we’ve studied just a tiny subset of those.  Which ones? Could whether or not we’ve studied a species depend on its name?

We compiled a list of 30 insect and mite species that have been tested for HAD, and then we compiled a much longer list of all the species in their genera (about 2700 of them, in total). For each, we recorded the etymology of the species’ Latin name. Some names derive from a geographic place (Oxya japonica); some from morphology (Thrips alatus); some are eponymous (Diplolepis mayri); and some refer to the plant the species attacks (Eurosta solidaginis, which attacks Solidago). We asked whether species named for their hosts were more likely to have been tested for HAD – by flipping the question over, and asking whether species tested for HAD were more likely to be named for their hosts. And they were! 47% of HAD-tested species have names based on their host plants; but only 23% of the other species do.

This is a startling result, because we’re not used to thinking this way. How scientists decide what to study is sort of mysterious, but I think we generally assume that it’s a rational process. We study a species that offers a good test of our hypothesis, or one that lives in an accessible location, or one that’s economically important. Our study suggests there’s more to it: we may be unconsciously influenced by what species are called. Actually, I don’t think that’s implausible. I started studying HAD with the species Gnorimoschema gallaesolidaginis, which makes galls on Solidago. I had noticed the galls, looked up what they were, and wondered if it was a generalist on multiple Solidago species or a complex of specialists. Did the species’ name pique my interest in its relationship with its host? Good question.

I promised a point about collaborations and author contributions. At the same time as I posted the preprint, I also submitted the manuscript to a journal.**** Part of the submission process involved listing the contributions different authors had made. I was lucky with this project: I got to work with some really fabulous scientists (and people): Julia Mlynarek, Chloe Cull, Jess Vickruck, and Amy Parachnowitsch. (As a side note, here’s some career advice: find smart, kind collaborators and everything is better.)

I found it easy enough to list who had curated the data, done the analysis, or contributed to writing; but being asked about study design and conception really made me think. The idea for the project goes back a long way. Julia and I had been talking about her leaf-miner HAD data and how lots of leaf miners are named for their host plants. But whose idea was it to ask whether name etymology influences whether HAD gets studied? I don’t know; my guess is mostly Julia, but it was back and forth so much that neither of us is really sure. And who came up with the approach to answer the question? I don’t know that either; my guess is mostly me, but it was back and forth so much that neither of us is really sure. So it’s a good thing that author contribution statements can be pretty vague, because even we can’t offer a straightforward list of our own contributions.

Does it seem strange that we can’t remember whose idea our study was? It won’t, if you’ve been lucky enough to have the very best kind of collaboration. In the very best collaborations, that’s how it works: you bat ideas back and forth, you scribble over each other’s sketches on a whiteboard, you change your minds repeatedly, you argue back and forth until you realize you’ve swapped positions and are still arguing. Eventually, it doesn’t even make sense to ask which one of you came up with the idea – but a new idea exists, and it’s one that neither of you could have come up with alone.

As scientists we’re in the business of finding stuff out. We desperately want to know things. But one thing I’m very happy not to know – not to be able to know – is “whose idea was this?” Not knowing that is one thing that makes science fun.

© Stephen Heard  July 5, 2022

Images: The preprint in question; and a gall of Gnorimoschema gallaesolidaginis on Solidago gigantea, © S Heard, CC BY 4.0

*^The job of a scientific name is not to describe a species. Yes, some scientific names are descriptive, like Setophaga cerulea, the cerulean (blue) warbler. But that’s not a necessary component of scientific names, and here’s why I think those who insist it should be are barking up the wrong tree.

**^Yes, that’s an acronym. Yes, I’ve railed against acronyms before, and more than once, and I’ll probably do so again. HAD is one I’m probably going to stick with.

***^Such a study normally involves collecting individuals of what appears to be a single generalist species, from two or more different plant hosts, and then using genetic markers such as allozymes, restriction fragments, microsatellites, or single-nucleotide polymorphisms to screen for genetic differences associated with the host plant from which the individual was collected. You asked. OK, you didn’t ask. But I told you anyway.

****^Despite arguments that posting preprints allows for feedback before submission to a journal, it rarely works that way – most preprints receive little if any feedback; and realizing that, most authors don’t wait to submit. There’s a fuller discussion of preprinting in Chapter 25 of The Scientist’s Guide to Writing.



9 thoughts on “A new preprint, author contributions, and the best kind of collaboration”

  1. baskin2013

    Fascinating study. Thank you. But please drop HAD. I am glad to see that you have railed about acronyms before but your pleas are not so convincing if you don’t follow your own advice.
    When I teach writing to graduate students and point out problems with acronyms, students agree wholeheartedly but only about acronyms used by others. Acronyms they use are normal and essential. But no. In the case here for example, I was interested to read about the influence of naming but I am not an evolutionary biologist and I had never come across the term ‘host-associated differentiation’. Every time I saw ‘HAD’ I had to spend some time back-translating the thing. Time that took me away from understanding your points. (That ‘had’ is a common English word compounded the problem).
    Sure, ‘host-associated differentiation’ is a mouthful. I suspect there would be times where ‘this kind of differentiation’ or the like would work. Turning words into symbols stops you from thinking about their meaning. Never a good thing. Here is an exercise: spell out all of the HADs in the above and then read the text carefully. I predict you will be surprised and start thinking about your meaning productively.

    1. ScientistSeesSquirrel Post author

      Thanks for this comment! It’s actually a good illustration of how one weighs when to use an acronym. When drafting, I actually did write out the phrase throughout, and it was REALLY cumbersome. So I somewhat reluctantly kept the acronym. But in light of your comment, maybe I got this wrong. In the preprint itself we use HAD, because it’s become a fairly standard term in itself, in the field. But for a more general readership, like here, the burden of the acronym is larger, and perhaps it outweighs the burden of the repeated long phrase. (I hadn’t thought of the possible confusion with the verb ‘had’ – although English is so replete with words that have two or more meanings, I’m less sure this is a big consideration.) Curious to see if others weigh in here!

      1. Pavel Dodonov

        I also think that the text would have flowed better without the acronym! You have such a nice flowing and readable style and placing an acronym in a blog post sort of breaks it. In the manuscript it’s probably fine if it is a widely used acronym in the field (for instance, I always use “EI” for edge influence although I also always rant about acronyms) but perhaps not so much in a blog post?

        And regarding the main subject of this post, this is quite an interesting study! I think that a species name can make a person curious about something, stimulating said person to study it. So, if names are a source of curiosity, this result doesn’t seem that surprising to me. Or if the name is related to the study question: host association here; or, for example, toponymy in studies of endemic species. For example, there’s an endemic lizard species called Glaucomastix abaetensis, named after a region (which is named after a lagoon) in Bahia, Brazil. It seems to me that there is a decent amount of public attention given to it; and perhaps if it were named differently this attention might be smaller. Something to be tested in a future study, perhaps 🙂


      2. baskin2013

        I think there is an attitude where people in a field take a key concept and acronymize it deliberately. I see this as partly territorial, marking the field, and partly because it is expected. I don’t know where this expectation comes from but it might have arisen early in the last century with biologists having a little bit of physics envy. Physicists get to use power symbols like 𝛙. But wherever it comes from, I think the impulse needs to be resisted. When you are telling a story about Little Red Riding Hood, you don’t write about LRRH. Sure, she has a mouthful of a name but that is her name. Abbreviating the name of a major character in your story is never a good thing. Being confronted with the awkwardness of host-associated differentiation might make you question the concept behind the term itself, not a bad thing, or at least take what the words mean seriously.


        1. ScientistSeesSquirrel Post author

          Well, I think maybe we’re going to disagree a bit here. I don’t think I’m failing to “take what the words mean seriously” – I’ve published multiple papers about host-associated differentiation. And I respect my readers enough to think that they aren’t going to fail to think about the concept because it’s abbreviated – any more than abbreviating ANOVA makes people bad statisticians! (Mind you, other things make people bad statisticians…) Now, I’m not saying that you’re wrong that in this particular case spelling out the acronym would make things better. But as much as I’m an anti-acronym crusader (an AAC?), taking the position that all acronyms are always bad and that they are power plays to conceal meaning seems a bit over the top. How many articles or books have you read recently that spell out “self-contained underwater breathing apparatus” each time it’s used? Considering acronyms is always a balance, which should be resolved in the reader’s interest. I’m still not sure what the correct resolution is for THIS piece, but I do know that the correct resolution is not ‘never ever use acronyms ever’ 🙂


          1. baskin2013

            I don’t mean to hijack comments here about your interesting article on the impact of naming. I think about acronyms in writing but I seldom have the chance to discuss this issue with others. I appreciate your thoughtful remarks. By all means shut this off when you see fit.
            Given how clotted papers are with acronyms, I am not sure that going over the top about them is necessarily a bad thing. But I certainly did not mean to imply that there could be no acronyms ever. As you have pointed out, certain acronyms are so widely accepted that they have become words in their own right. Indeed, scuba is a good example, so much so that I rarely see it written as SCUBA. And I think it would be perfectly reasonable to write “anova” as a word. Another example is DNA, which is arguably more familiar and even easier to read than deoxyribonucleic acid. Sure.
            But I would also say that those examples are far more familiar than HAD, which I still feel is best avoided even in a journal article. Perhaps the most important reason is that acronyms like that are exclusive. They create an in-crowd, those who have slogged their way through this literature, and an out-crowd, those who have not. Wouldn’t you like people in other fields to be able to read your work? Or how about students, who are just getting started? Why put up a barrier, even a small one, to ready comprehension? Because until the string of letters becomes familiar, you have to translate it to words, which takes mental energy, even when you can call up what the symbol stands for.
            It is true that competing demands need to be balanced and that host-associated differentiation is a little awkward. But to the extent that some of your readers will not be familiar with HAD, I think the balance in this case comes down on the side of spelling the phrase out.
            Thanks for listening and once again sorry to change the subject of the post.

