Warning: mostly trivial.
I have several friends who are ready to die on the hill that’s the plurality of “data”. Writing “the data suggests” or “the data is strong”, for these folks, isn’t just wrong: it’s a crime against the sanctity of the English language, and a grievous insult to right-thinking scholars everywhere. And for some reason (probably because they know I wrote a book about writing), these particular friends turn to me for backup. But here’s the thing: I was once on their side, and I’ve since thrown in the towel.
Sure, “data” is etymologically plural – we borrowed both a singular form (datum) and a plural form (data) from Latin. The case for “data are” rests on this, as well as the notion that there are a bunch of singular “datum”s added together to make some “data”. This case would be more convincing if we ever used the word “datum” (other than meaning a fixed reference point in space, in the surveying sense). It would be more convincing if we insisted with similar vehemence that “agenda” is plural (Latin, one “agendum”, two “agenda”)*. And it would be more convincing if English words had to mean what their etymologies or their components suggest (what kind of fly is a butterfly, and is it made of butter? And don’t get me started on baby oil).
Maybe we could make and enforce a rule that “data” is a plural noun, if we had some kind of system for making and enforcing rules in English. We don’t. The meanings of English words are simply conventions between writers and readers, speakers and listeners. They shift in time (“nice” once meant “stupid”, but since someone called me a nice guy last week, I sure hope it doesn’t any more); they vary in space; and they differ among social groups. Sometimes a convention is broadly agreed-upon and lasts for a long time, and such stable words may be trouble-free. But for words, or at times, or in places where a convention is in flux, we get the furious arguments that go along with “data is”.**
Given the way English works, it isn’t necessary to worry about whether treating “data” as a singular makes logical sense – if enough people treat it that way, it just is; and we’re pretty much there. But perhaps it helps to realize that the logical case for “data is” is good. Plenty of English nouns are non-count (or “mass”) nouns: they’re treated as singular even though they refer to a collection of theoretically countable units. Consider “sand”, for instance, or “rice”, “equipment”, “information” (with its nice parallel to “data”), and many more.*** Thinking of “data” as a non-count noun suggests a view in which inference is normally based on data in aggregate. This makes sense: we don’t check whether each individual data point is consistent with a hypothesis, or at least, we shouldn’t.
So along with a sizeable majority of our planet’s denizens, I’m perfectly OK with “data is”. But in case you’re thinking of me as a grammatical old softie, I’m not ready to throw in the towel on “criteria”. Seeing “one criteria is…” makes me boil inside. I think I’m on solid ground there, actually, partly because our “criterion/criteria” convention hasn’t weakened far enough yet to make “one criteria” something other than an error, and partly because I don’t see a case for “criteria” as a non-count noun to justify a convention shift. One day, I may have to throw in this towel too, because that’s just how English works. But for now, one “criterion” and two “criteria”, thank you very much.
Data is, data are: where do you stand?
© Stephen Heard January 21, 2020
Image: The Wrong Hill to Die On, by Donis Casey. OK, there’s no connection other than that I liked the cover and the title went with the post. I haven’t even read it… but I’m going to.
*^No, that’s not a one-off example. Consider the following singular English nouns, all of which are plural in their source languages: “biscotti”, “candelabra”, “opera”, “zucchini”. By the way, the Wikipedia article on English plurals is absolutely (if unexpectedly) fascinating.
***^Interestingly, languages don’t necessarily agree on which nouns are non-count nouns. Both “equipment” and “information” are count nouns in French, so it makes perfect sense to say “certains équipements” and “plus d’informations”.