It’s been amazing, over the last decade, to watch the incoming tide of R swamp every other tool for statistical analysis (at least in my own field, ecology and evolution). I’ve mostly come to accept my new statistical overlord*. But what I don’t understand is R graphics.
R culture is coding culture (as was the SAS culture I cut my teeth on). And that’s great. I want my analyses to run using code, because I can run them over again, either with old datasets or new ones. It’s a great feeling to dig out an old dataset and an old piece of code, and Bob’s-your-uncle there’s the same old result as a jumping-off point for the next step. No reinventing the wheel! But figures? Why would I code for figures? For figures, I want to point and click**.
My grad students are all about figures in R. And my Twitter timeline is full of snarky comments about the superiority of R for figures. Some poke fun at the ugliness of Excel defaults – as if that were somehow relevant to something, and as if Excel were the only other graphics program they’ve heard of, and as if R’s defaults weren’t equally ugly. Others take a different tack: they point to the reproducibility virtue of coding. But why does the production of a figure need to be reproducible? Figures and analyses aren’t the same kind of thing. An analysis is a decision-making, hypothesis-testing instrument, and it absolutely makes sense for my analysis to be reproducible (by me or someone else). A figure, though, is a storytelling device (“data visualization” is the trendy term). A figure is more like a turn of phrase or the organization of a paragraph: it’s an aid to understanding for the reader. What matters is only how one can best produce a figure that’s easy to understand – which means being able to make different kinds of plots quickly and easily, tweak the graphic design, and so on. It’s certainly possible to do this sort of thing via code, but I’ve seen no evidence that it’s easier or faster*** (and plenty of evidence in the other direction).
So I can’t figure out what advantage coding has for making figures, and I can see lots of costs. Now, when everyone is doing thing X and I’m not, I usually assume that I’m missing something. And I’m sure you’ll tell me so in the Replies. But the thing is, I’m skeptical in this case – because I know one thing: [crossed out: professional graphic designers don’t use code. Those whose careers focus on building effective communication through graphics use point-and-click software written expressly for the manipulation of visual elements. I do too.] (EDIT:) Several folks have told me I’m wrong (see Replies) about that crossed-out bit. It’s good to have (constructively) critical readers! It is true (and I glossed over it) that production of dynamically-updated graphics served via the web depends on coding – it probably has to. What surprises me is to be told that many professionals write code to produce static, one-time graphics. I’m going to assume my commenters are right about this, in which case times have certainly changed. You might think this new knowledge would convert me, but I’m afraid that so far it only mystifies me more deeply. I’m perhaps a bit too reassured to learn that a lot of people produce graphics using code (R or otherwise) – but then tweak or polish them in a specialized program like Adobe Illustrator. Perhaps the best message is this: use the tools that work for you. For me, that’s not coding – at least, not for figures. (END EDIT)
© Stephen Heard (firstname.lastname@example.org) May 4, 2017
*A bit grudgingly, perhaps. I’d be more likely to throw down rose petals if it wasn’t a language with such dreadfully constructed syntax (heresy!), didn’t have such an enormously steep learning curve, didn’t have astonishingly cryptic error messages and unreadable documentation, and didn’t leave so much uncertainty about how we can know its results are actually correct. But all that is a topic for another day.
**There are lots of point-and-click options. I happen to use Sigmaplot for graphs, PowerPoint for line art, and every now and again a speciality program for something it does well (like TreeView for phylogenetic trees).
***With the admitted exception that while coding slows you down immensely in making the first figure of a particular sort, you can make that up if you’re making many figures with different data but the same template. But here’s the thing: you mostly shouldn’t be doing that. It’s become common to harass your reader with online supplements containing figure after figure after figure, and to make each one hugely complex with dozens of panels and traces. But this is just an abdication of the writer’s responsibility to find and tell a story.