In Good uses for fake data (part 1), I expounded on the virtues of fake – or “toy” – datasets for understanding statistical analyses. But that’s not the only good use for fake data. Fake data (this time, maybe a better term would be “model data”) can also be extremely useful in planning and writing up research. Once again, let me assure you that I’m – of course – not advocating data fakery for publication! Instead, fake data can help you think through how you’re going to present and interpret the results of an experiment or an analysis (or perhaps even whether you can interpret them), before you actually spend effort getting data in hand.
I’d put this in the context of “early writing”, a strategy that interweaves the writing of science with the doing of science. The alternative is doing the science first and writing it up when it’s “done”, an approach that always seemed so obvious to me that I never thought to question it. Early writing makes writing easier, and can help you spot problems with your work’s design before it’s too late.