Category Archives: statistics

Presenting statistical results: pointers in scientific writing

This semester, I’m co-teaching a graduate/advanced-undergraduate level course in biostatistics and experimental design.  This is my lecture on how to present statistical results when writing up a study.  It’s a topic I’ve written about before, and what I presented in class draws on several older blog posts here at Scientist Sees Squirrel.  However, I thought it would be useful to pull this together into a single (longish) post, with my slides to illustrate it.  If you’d like to use any of these slides, here’s the PowerPoint – licensed CC BY-NC 4.0.

(Portuguese translation here, for those who prefer.)

Here goes.

How should you present statistical results in a scientific paper?

From ego naming to cancer screening: type I error and rare events

Image: Unicorn fresco by Domenichino (1581-1641), in the Palazzo Farnese, Rome, via wikimedia.org

Sometimes, thinking about science, I make odd connections.  Often, they seem odd when I first make them, but then I learn something important from them and wonder why I’d never made them before.  A good example cropped up the other day, when I realized that a peculiar feature of the scientific naming of organisms connects, via some simple statistics, to the difficulty of cancer screening, to reproducibility, and to the burden of proof for surprising claims.  Curious?  Here goes.

The efficiency of the lazy scientist

Photo: Lazy red panda CC 0 via pxhere.com

I’ve just published a paper that had some trouble getting through peer review.  Nothing terribly unusual about that, of course, and the paper is better for its birthing pains.  But one reviewer comment (made independently, actually, by several different reviewers) really bugged me.  It revealed some fuzzy thinking that’s all too common amongst ecologists, having to do with the value of quick-and-dirty methods.  Quick-and-dirty methods deserve more respect.  I’ll explain using my particular paper as an example, first, and then provide a general analysis.

Statistics in Excel, and when a Results section is “too short”

Every now and again, you see a critique of a manuscript that brings you up short and makes you go “Huh”.

A student of mine defended her thesis a while ago, and one of her examiners commented on one of her chapters that “the Results section is too short”*.  Huh, I said.  Huh.

I’m quite used to seeing manuscripts that are too long.  Occasionally, I see a manuscript that’s too short.  But this complaint was more specific: that the Results section in particular was too short.  I’d never heard that one, and I just couldn’t make sense of it.  Or at least, not until I realized that it fits in with another phenomenon that I see and hear a lot: the suggestion that nobody should ever, ever do their statistics in Excel.

The statistics of pesto

Image: This is what 1300 g (2.9 lb) of basil looks like.

Yesterday (as I write) I bought some basil at my local farmer’s market.  Quite a lot of basil, actually – almost 3 pounds of it – because it was my annual pesto-making day*.  My favourite vendor sells basil by the stem (at 50¢ each), and I started pulling stems from a large tub.  Some stems were quite small, and some were huge, with at least a five-fold difference in size between smallest and largest (and no, I didn’t get to just pick out the huge ones).  So how many stems did I need?  Or to put it the other way around, given that I bought 49 stems, how many batches of pesto would I be making, and how many cups of walnuts would I need?

My undergrad students – like a lot of biology students – don’t like statistics.

Are two years’ data better than one?

Photo: Two giraffes by Vera Kratochvil, released to public domain, via publicdomainpictures.net. Two giraffes are definitely better than one.

Ecologists are perennially angst-ridden about sample size.  A lot of our work is logistically difficult, involves observations on large spatial or temporal scales, or involves rare species or unique geographic features.  And yet we know that replication is important, and we bend over backwards to achieve it.

Sometimes, I think, too far backward, and this can result in wasted effort.

Two tired misconceptions about null hypotheses

Comic: xkcd #892, by Randall Munroe

For some reason, people seem to love taking shots at null-hypothesis/significance-testing statistics, despite its central place in the logic of scientific inference.  This is part of a bigger pattern, I think:  it’s fun to be iconoclastic, and the more foundational the icon you’re clasting (yes, I know that’s not really a word), the more fun it is.  So the P-value takes more than its share of drubbing, as do decision rules associated with it.  The null hypothesis may be the most foundational of all, and sure enough, it also takes abuse.

I hear two complaints about null hypotheses – and I’ve been hearing the same two since I was a grad student.  That’s mumble-mumble years listening to the same strange but unkillable misconceptions, and when both popped their heads up again within a week, I gave myself permission to rant about them a little bit.  So here goes.