Like everyone else, I’ve been watching the rise of “generative AI” with both interest and trepidation. (“Generative AI” is software that creates “new” text (ChatGPT) or images (DALL-E) from a user prompt – I’ll explain the quotes on “new” later.) Now, I know only a smattering about how generative AI works, so don’t expect technical insights here. But I’ve noticed an interesting gap between what I think these systems are doing and how people are reacting to them.
My interest in generative AI, especially text generators, is easily explained and probably obvious. Since I was in high school I’ve watched software get very slowly better at imitating the kind of writing humans do with great effort, and the kind of conversational interaction that humans do without a second thought.* The latest round is, superficially, really impressive: it can chatter pleasantly about nothing much, write a poem, program in R,** write an essay about Canadian history, explain linkage disequilibrium, and more. Or at least, it often looks like it can.
Here in Canada we’ve just had a federal election. As politics often does, it put on display two kinds of people: those whose thinking has led them to have strong opinions, and those whose strong opinions have led them to stop thinking. I saw a stunningly good example of the latter group, and the amusing story carries a message that applies much more broadly. So here goes.
Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars, by Deborah G. Mayo. Cambridge University Press, 2018.
If there’s one thing we can all agree on about statistics, it’s that there are very few things we all agree on about statistics. The “statistics wars” that Deborah Mayo would like to help us get beyond have been with us for a long time; the battlefields and the armies shift, but the wars have been raging from the very beginning. Is inference about confidence in a single result or about long-term error rates? Is the P-value essential to scientific inference or a disastrous red herring holding science back? Does model selection do something fundamentally different from null-hypothesis significance testing (NHST), and if so, what? If we use NHST, is the phrase “nearly significant” evidence of sophisticated statistical philosophy or evil wishful thinking? Is Bayesian inference irredeemably subjective or the only way to convert data into evidence? These issues and more seem to generate remarkable amounts of heat – sometimes (as with Basic and Applied Social Psychology’s banning of the P-value) enough heat to seem like scorched-earth warfare*.
This semester, I’m co-teaching a graduate/advanced-undergraduate level course in biostatistics and experimental design. This is my lecture on how to present statistical results when writing up a study. It’s a topic I’ve written about before, and what I presented in class draws on several older blog posts here at Scientist Sees Squirrel. However, I thought it would be useful to pull this together into a single (longish) post, with my slides to illustrate it. If you’d like to use any of these slides, here’s the PowerPoint – licensed CC BY-NC 4.0.
(Portuguese translation here, for those who prefer.)
How should you present statistical results, in a scientific paper?
Photo: Lazy red panda CC0 via pxhere.com
I’ve just published a paper that had some trouble getting through peer review. Nothing terribly unusual about that, of course, and the paper is better for its birthing pains. But one reviewer comment (made independently, actually, by several different reviewers) really bugged me. It revealed some fuzzy thinking that’s all too common amongst ecologists, having to do with the value of quick-and-dirty methods. Quick-and-dirty methods deserve more respect. I’ll explain using my particular paper as an example first, and then provide a general analysis.
Graphic: A fake regression. You knew those were fake data, right? I may spend my entire career without getting a real regression that tight.
If you clicked on this post out of horror, let me assure you, first off, that it isn’t quite what you fear. I don’t – of course – endorse faking data for publication. That happens, and I agree it’s a Very Bad Thing, but it isn’t what’s on my mind today.
What I do endorse, and in fact encourage, is faking data for understanding. Fake data (maybe “toy data” would be a better term) can help us understand real data, and in my experience this is a tool that’s underused.
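As a minimal sketch of the idea (in Python here, which is my assumption, not the post’s – none of this code comes from the original): simulate toy data from a known truth, fit the same model you’d use on real data, and check whether the analysis recovers what you put in. If it doesn’t, you’ve learned something about your analysis before any real data were at stake.

```python
import numpy as np

rng = np.random.default_rng(42)

# Fake ("toy") data from a known truth: y = 2x + 1 plus noise.
n = 100
x = rng.uniform(0, 10, n)
y = 2.0 * x + 1.0 + rng.normal(0, 3.0, n)

# Fit a simple linear regression and see how well we recover the truth.
slope, intercept = np.polyfit(x, y, 1)
print(f"true slope = 2.0, estimated slope = {slope:.2f}")
print(f"true intercept = 1.0, estimated intercept = {intercept:.2f}")
```

Because you chose the true slope and intercept yourself, any gap between truth and estimate tells you about noise, sample size, or a bug in your pipeline – which is exactly the understanding fake data buys you.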
(Warning: long and slightly wonkish)
If you’re like me, you’re continually frustrated by the fact that undergraduate students struggle to understand statistics. Actually, that’s putting it mildly: a large fraction of undergraduates simply refuse to understand statistics; mention a requirement for statistical data analysis in your course and you’ll get eye-rolling, groans, or (if it’s early enough in the semester) a rash of course-dropping.
This bothers me, because we can’t do inference in science without statistics*. Why are students so unreceptive to something so important?
(graphic by Chen-Pan Liao via wikimedia.org)
The P-value (and by extension, the entire enterprise of hypothesis-testing in statistics) has been under assault lately. John Ioannidis’ famous “Why most published research findings are false” paper didn’t start the fire, but it threw quite a bit of gasoline on it. David Colquhoun’s recent “An investigation of the false discovery rate and the misinterpretation of P-values” raised the stakes by opening with a widely quoted and dramatic (but also dramatically silly) proclamation that “If you use P=0.05 to suggest that you have made a discovery, you will be wrong at least 30% of the time.”* While I could go on citing examples of the pushback against P, it’s inconceivable that you’ve missed all this, and it’s well summarized by a recent commentary in Nature News. Even the webcomic xkcd has piled on.
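For readers wondering where a figure like “30%” can come from at all, here is a back-of-envelope false-discovery-rate calculation in the spirit of Colquhoun’s argument. The prior probability and power values below are illustrative assumptions of mine, not numbers from this post, and the whole point of the controversy is that the answer depends heavily on them.

```python
# False-discovery-rate arithmetic: among tests that reach P < alpha,
# what fraction are false positives? (Prior and power are assumed.)
prior = 0.10   # assumed fraction of tested hypotheses that are actually true
power = 0.80   # assumed probability of detecting a true effect
alpha = 0.05   # significance threshold

true_positives = prior * power          # 0.08 of all tests
false_positives = (1 - prior) * alpha   # 0.045 of all tests
fdr = false_positives / (false_positives + true_positives)
print(f"False discovery rate: {fdr:.0%}")  # 36% under these assumptions
```

Change the prior to 0.5 and the same arithmetic gives a far smaller rate – which is one reason critics call the blanket “30%” claim dramatically silly.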