Why I threw in the towel on “data is”

Warning: mostly trivial.

I have several friends who are ready to die on the hill that’s the plurality of “data”.  Writing “the data suggests” or “the data is strong”, for these folks, isn’t just wrong: it’s a crime against the sanctity of the English language, and a grievous insult to right-thinking scholars everywhere.  And for some reason (probably because they know I wrote a book about writing), these particular friends turn to me for backup.  But here’s the thing: once, I was on their side; but I’ve thrown in the towel.

Sure, “data” is etymologically plural – we borrowed both a singular form (datum) and a plural form (data) from Latin.  The case for “data are” rests on this, as well as the notion that there are a bunch of singular “datum”s added together to make some “data”.  This case would be more convincing if we ever used the word “datum” (other than meaning a fixed reference point in space, in the surveying sense).  It would be more convincing if we insisted with similar vehemence that “agenda” is plural (Latin, one “agendum”, two “agenda”)*.  And it would be more convincing if English words had to mean what their etymologies or their components suggest (what kind of fly is a butterfly, and is it made of butter? And don’t get me started on baby oil).

Maybe we could make and enforce a rule that “data” is a plural noun, if we had some kind of system for making and enforcing rules in English.  We don’t.  The meanings of English words are simply conventions between writers and readers, speakers and listeners. They shift in time (“nice” once meant “stupid”, but since someone called me a nice guy last week, I sure hope it doesn’t any more); they vary in space; and they differ among social groups.  Sometimes a convention is broadly agreed-upon and lasts for a long time, and such stable words may be trouble-free.  But for words, or at times, or in places where a convention is in flux, we get the furious arguments that go along with “data is”.**

Given the way English works, it isn’t necessary to worry about whether treating “data” as a singular makes logical sense – if enough people treat it that way, it just is; and we’re pretty much there. But perhaps it helps to realize that the logical case for “data is” is good.  Plenty of English nouns are non-count (or “mass”) nouns: they’re treated as singular even though they refer to a collection of theoretically countable units.  Consider “sand”, for instance, or “rice”, “equipment”, “information” (with its nice parallel to “data”), and many more.***  Thinking of “data” as a non-count noun suggests a view in which inference is normally based on data in aggregate.  This makes sense: we don’t check whether each individual data point is consistent with a hypothesis, or at least, we shouldn’t.

So along with a sizeable majority of our planet’s denizens, I’m perfectly OK with “data is”. But in case you’re thinking of me as a grammatical old softie, I’m not ready to throw in the towel on “criteria”.  Seeing “one criteria is…” makes me boil inside.  I think I’m on solid ground there, actually, partly because our “criterion/criteria” convention hasn’t weakened far enough yet to make “one criteria” something other than an error, and partly because I don’t see a case for “criteria” as a non-count noun to justify a convention shift.  One day, I may have to throw in this towel too, because that’s just how English works.  But for now, one “criterion” and two “criteria”, thank you very much.

Data is, data are: where do you stand?

© Stephen Heard  January 21, 2020

 Image: The Wrong Hill to Die On, by Donis Casey. OK, there’s no connection other than that I liked the cover and the title went with the post.  I haven’t even read it… but I’m going to.

*^No, that’s not a one-off example.  Consider the following singular English nouns, all of which are plural in their source languages: “biscotti”, “candelabra”, “opera”, “zucchini”.  By the way, the Wikipedia article on English plurals is absolutely (if unexpectedly) fascinating.

**^I think we’re pretty much done gnashing our teeth over the singular “they”, which is a very good thing.

***^Interestingly, languages don’t necessarily agree on which nouns are non-count nouns.  Both “equipment” and “information” are count nouns in French, so it makes perfect sense to say “certains équipements” and “plus d’informations”.

21 thoughts on “Why I threw in the towel on “data is”

  1. jlundholm1970s

    Makes perfect sense to me. Language is constantly changing and we need to roll with it. But I can’t seem to get over the “misuse” of the apostrophe…

  2. Char

    Won’t correct anyone for using “data” as singular, but I wouldn’t do it myself. Same with “medium/media”. My personal writing rules =/= anyone else’s rules


  3. H Stiles (@HStiles1)

    I am dying on that hill! Probably already died there…
    …data is data is data is!
    Countables & non-countables. It is a collective thing, data, something accumulated by the data-gatherer, like a pile of nuts gathered by a squirrel. In contrast, a government is a singular thing – it acts as a body, so ‘the government says’ not ‘the government say’…
    Cheese is an example I like, that can be countable & non-countable – well, I like ‘some cheeses’ but not ‘all cheese.’ Of course, I am only ‘one data point’ – !!!

    I also, while we are here, blame ‘CSI’ on TV for getting all scientists to start their exposition with ‘so’ – rather than the older filler-word ‘well’… arrrgh!


