Warning: mostly trivial.
I have several friends who are ready to die on the hill that’s the plurality of “data”. Writing “the data suggests” or “the data is strong”, for these folks, isn’t just wrong: it’s a crime against the sanctity of the English language, and a grievous insult to right-thinking scholars everywhere. And for some reason (probably because they know I wrote a book about writing), these particular friends turn to me for backup. But here’s the thing: once, I was on their side; but I’ve thrown in the towel.
Sure, “data” is etymologically plural – we borrowed both a singular form (datum) and a plural form (data) from Latin. The case for “data are” rests on this, as well as the notion that there are a bunch of singular “datum”s added together to make some “data”. This case would be more convincing if we ever used the word “datum” (other than meaning a fixed reference point in space, in the surveying sense). It would be more convincing if we insisted with similar vehemence that “agenda” is plural (Latin, one “agendum”, two “agenda”)*. And it would be more convincing if English words had to mean what their etymologies or their components suggest (what kind of fly is a butterfly, and is it made of butter? And don’t get me started on baby oil).
Maybe we could make and enforce a rule that “data” is a plural noun, if we had some kind of system for making and enforcing rules in English. We don’t. The meanings of English words are simply conventions between writers and readers, speakers and listeners. They shift in time (“nice” once meant “stupid”, but since someone called me a nice guy last week, I sure hope it doesn’t any more); they vary in space; and they differ among social groups. Sometimes a convention is broadly agreed-upon and lasts for a long time, and such stable words may be trouble-free. But for words, or at times, or in places where a convention is in flux, we get the furious arguments that go along with “data is”.**
Given the way English works, it isn’t necessary to worry about whether treating “data” as a singular makes logical sense – if enough people treat it that way, it just is; and we’re pretty much there. But perhaps it helps to realize that the logical case for “data is” is good. Plenty of English nouns are non-count (or “mass”) nouns: they’re treated as singular even though they refer to a collection of theoretically countable units. Consider “sand”, for instance, or “rice”, “equipment”, “information” (with its nice parallel to “data”), and many more.*** Thinking of “data” as a non-count noun suggests a view in which inference is normally based on data in aggregate. This makes sense: we don’t check whether each individual data point is consistent with a hypothesis, or at least, we shouldn’t.
So along with a sizeable majority of our planet’s denizens, I’m perfectly OK with “data is”. But in case you’re thinking of me as a grammatical old softie, I’m not ready to throw in the towel on “criteria”. Seeing “one criteria is…” makes me boil inside. I think I’m on solid ground there, actually, partly because our “criterion/criteria” convention hasn’t weakened far enough yet to make “one criteria” something other than an error, and partly because I don’t see a case for “criteria” as a non-count noun to justify a convention shift. One day, I may have to throw in this towel too, because that’s just how English works. But for now, one “criterion” and two “criteria”, thank you very much.
Data is, data are: where do you stand?
© Stephen Heard January 21, 2020
Image: The Wrong Hill to Die On, by Donis Casey. OK, there’s no connection other than that I liked the cover and the title went with the post. I haven’t even read it… but I’m going to.
*No, that’s not a one-off example. Consider the following singular English nouns, all of which are plural in their source languages: “biscotti”, “candelabra”, “opera”, “zucchini”. By the way, the Wikipedia article on English plurals is absolutely (if unexpectedly) fascinating.
**I think we’re pretty much done gnashing our teeth over the singular “they”, which is a very good thing.
***Interestingly, languages don’t necessarily agree on which nouns are non-count nouns. Both “equipment” and “information” are count nouns in French, so it makes perfect sense to say “certains équipements” and “plus d’informations”.
Makes perfect sense to me. Language is constantly changing and we need to roll with it. But I can’t seem to get over the “misuse” of the apostrophe…
Ugh, yes, agreed. Despite the fact that it’s rarely a problem in terms of clarity of meaning, it irritates me no end…
Data are sacrosanct 🙂 Another hill that I am willing to die on is fewer or less 🙂
Oh Simon 🙂 Make you a deal: how about we each consider the other one wrong, but rather harmlessly so?
but of course 🙂
I’ll die on the fewer or less hill
Won’t correct anyone for using “data” as singular, but I wouldn’t do it myself. Same with “medium/media”. My personal writing rules =/= anyone else’s rules
I am dying on that hill! Probably already died there…
…data is data is data is!
Countables & non-countables. It is a collective thing, data, something accumulated by the data-gatherer, like a pile of nuts gathered by a squirrel. In contrast, a government is a singular thing – it acts as a body, so ‘the government says’ not ‘the government say’…
Cheese is an example I like, that can be countable & non-countable – well, I like ‘some cheeses’ but not ‘all cheese.’ Of course, I am only ‘one data point’ – !!!
I also, while we are here, blame ‘CSI’ on TV for getting all scientists to start there exposition with ‘so’ – rather than the older filler-word ‘well’… arrrgh!
🙂
So then clearly you believe that “rice” is a plural, and you use “ricum” for a single grain? 🙂
arrgh! THEIR!!! am I an idiot?
sadly yes 😦
I will not riseum to the baitum!
ricin? No – that is taken…
🙂
Ha, I didn’t even notice your ‘their’ slip! Forgiven 🙂
You can also blame Grissom in CSI for perpetuating the myth that the Tequila worm is an annelid and not the lepidopteran larva that it actually is 🙂
Anyone want to take a bet on how long it will be before “datas” becomes the plural?
Add my towel to the pile. As long as we don’t start accepting “would of” instead of “would have” I will be OK….
I don’t get my undies in a bundle over this one. I know what the author means either way.
About the fewer vs less “rule.” This blog post talks about its history.
“No matter that the leading authorities of our own time point out that it’s a fake rule; fakes are often more popular than the real things.”
https://sesquiotic.com/2019/07/07/one-fewer-thing-to-fuss-about/
You wouldn’t want to deprive me of my shouting at the radio or TV when I hear people using them ‘incorrectly’, would you? 🙂
Oh, I fervently do want to deprive you of that! 🙂
This peever and stickler list might make your head explode.
https://goo.gl/kOv398
My wife will go ballistic 🙂
I find stratum/strata important in epidemiology, so find it easier to stick with data are