Category Archives: statistics

Two tired misconceptions about null hypotheses

Comic: xkcd #892, by Randall Munroe

For some reason, people seem to love taking shots at null-hypothesis/significance-testing statistics, despite its central place in the logic of scientific inference.  This is part of a bigger pattern, I think:  it’s fun to be iconoclastic, and the more foundational the icon you’re clasting (yes, I know that’s not really a word), the more fun it is.  So the P-value takes more than its share of drubbing, as do decision rules associated with it.  The null hypothesis may be the most foundational of all, and sure enough, it also takes abuse.

I hear two complaints about null hypotheses – and I’ve been hearing the same two since I was a grad student.  That’s mumble-mumble years of listening to the same strange but unkillable misconceptions, and when both popped their heads up again within a week, I gave myself permission to rant about them a little bit.  So here goes. Continue reading

Population sign for Gross, Nebraska (population 3)

Small-town mayors, functional traits, and the estimation of extremes

Warning: gets a bit wonkish near the end.

Have you ever noticed that the mayor of a small town is fairly often a bonehead?  There’s a simple reason we’d expect that to be true – and that simple reason has implications for academic searches, the traits we analyze in ecology and systematics, and lots of other things, too (please add to my list in the Replies).  The simple reason is this:  it’s really hard to estimate extremes.  It’s also really hard to understand why so many people act as if they’re unaware of this.

Let’s start with those mayors.  Continue reading
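If you want to see the “extremes are hard to estimate” point in action, here’s a minimal Python simulation sketch. Everything in it is invented for illustration (a hypothetical “population” of trait values, arbitrary sample sizes); it just shows that a sample mean is a stable, unbiased estimate of the population mean, while a sample maximum is both highly variable and biased low relative to the true maximum.

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy "population" of 100,000 trait values (numbers invented;
# think leaf sizes, body masses, or mayoral competence scores).
population = rng.normal(loc=50, scale=10, size=100_000)

n, reps = 30, 1000          # arbitrary sample size and number of resamples
means = np.empty(reps)
maxima = np.empty(reps)
for i in range(reps):
    sample = rng.choice(population, size=n, replace=False)
    means[i] = sample.mean()
    maxima[i] = sample.max()

print(f"population mean = {population.mean():.1f}, population max = {population.max():.1f}")
print(f"sample means:  {means.mean():.1f} (sd {means.std():.2f})")   # stable and unbiased
print(f"sample maxima: {maxima.mean():.1f} (sd {maxima.std():.2f})") # variable and biased low
```

With these made-up numbers, the sample maxima fall several units short of the true population maximum on average, and bounce around far more than the sample means do – which is the whole problem with trying to estimate (or hire, or elect) an extreme.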

Why do we mention stats software in our Methods?

Image: Excerpt from Heard et al. 1999, Mechanical abrasion and organic matter processing in an Iowa stream. Hydrobiologia 400:179-186.

Nearly every paper I’ve ever written includes a sentence something like this: “All statistical analyses were conducted in SAS version 8.02 (SAS Institute Inc., Cary, NC).”*  But I’m not quite sure why.

Why might any procedural detail get mentioned in the Methods?  There are several answers to that, with the most common being: Continue reading

The most useful statistical test that nobody knows

Image: Plaque commemorating Fisher on Inverforth House. Peter O’Connor via flickr.com, CC BY-SA 2.0

Do you know Fisher’s method for combining P-values?  If you do, move along; I’ve got nothing for you. If you don’t, though, you may be interested in what’s surely the most useful statistical test that – despite the fame of Fisher himself – nobody knows about.

Fisher’s method is the original meta-analysis.  When I was a grad student, and nobody had heard of meta-analysis (or cell phones, or the internal-combustion engine), I had a supervisory committee member who liked to make strong statements.  One of his favourites was “A bunch of weak tests don’t add up to a strong test!”  Continue reading
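For the curious, here’s a minimal Python sketch of Fisher’s method (the P-values are toy numbers I made up, not anything from the post). Under the global null, the quantity -2·Σln(pᵢ) follows a chi-squared distribution with 2k degrees of freedom, where k is the number of independent tests being combined.

```python
import numpy as np
from scipy import stats

def fishers_method(pvalues):
    """Fisher's combined probability test: under the global null,
    -2 * sum(ln p_i) is chi-squared distributed on 2k degrees of freedom."""
    pvalues = np.asarray(pvalues, dtype=float)
    chi2 = -2.0 * np.log(pvalues).sum()
    df = 2 * pvalues.size
    return chi2, stats.chi2.sf(chi2, df)

# Five individually "weak" tests: none significant on its own...
weak = [0.09, 0.12, 0.07, 0.11, 0.08]
chi2, p = fishers_method(weak)
print(f"chi-squared = {chi2:.2f} on {2 * len(weak)} df, combined P = {p:.4f}")
# ...but combined they're strong evidence against the shared null (P < 0.01).
# scipy.stats.combine_pvalues(weak, method='fisher') gives the same result.
```

With those invented P-values, five tests that are each individually nonsignificant combine to P < 0.01 – which is precisely why a bunch of weak tests can, in fact, add up to a strong test.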

Statistics and significant digits

(My writing pet peeves – Part 1)

Image: completely fake “data”, but a real 1-way ANOVA; S. Heard.

I read a lot of manuscripts – student papers, theses, journal submissions, and the like.  You can’t do that without developing a list of pet peeves about writing, and yes, I’ve got a little list*.

Sitting atop my pet-peeve list these days: test statistics, P-values, and the like reported to ridiculous levels of precision – or, rather, pseudo-precision.  I’ve done it in the figure above: F1,42 = 4.716253, P = 0.0355761.  I see numbers like these all the time – but, really? Continue reading
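Fixing the pseudo-precision is trivial, which makes it all the more peevish-making. Here’s a toy Python formatter as a sketch; the particular convention (two decimals for F, three for P, with very small P-values reported as an inequality) is my own assumption, not a rule laid down in the post.

```python
def report_anova(f_value, df1, df2, p_value):
    """Format a test statistic and P-value to defensible precision:
    two decimals for F; three decimals for P, or '< 0.001' if smaller."""
    p_str = "< 0.001" if p_value < 0.001 else f"= {p_value:.3f}"
    return f"F({df1},{df2}) = {f_value:.2f}, P {p_str}"

print(report_anova(4.716253, 1, 42, 0.0355761))
# F(1,42) = 4.72, P = 0.036
```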

Is “nearly significant” ridiculous?

Graphic: Parasitoid emergence from aphids on peppers, as a function of soil fertilization. Analysis courtesy of Chandra Moffat (but data revisualized for clarity).

“Every time you say ‘trending towards significance’, a statistician somewhere trips and falls down.” This little joke came to me via Twitter last month. I won’t say who tweeted it, but they aren’t alone: similar swipes are very common. I’ve seen them from reviewers of papers, audiences of conference talks, faculty colleagues in lab meetings, and many others. The butt of the joke is usually someone who executes a statistical test, finds a P value slightly greater than 0.05, and has the temerity to say something about the trend anyway. Sometimes the related sin is declaring a P value much smaller than 0.05 “highly significant”. Either way, it’s a sin of committing statistics with nuance.

Why do people think the joke is funny? Continue reading
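One way to see what’s at stake (a quick sketch, with an arbitrarily chosen degrees-of-freedom value, not an analysis from the post): compare the t statistics that would produce P-values just below and just above the 0.05 line. They are nearly identical, because the evidence scale is continuous even though the decision rule is not.

```python
from scipy import stats

# How different is the evidence behind P = 0.049 and P = 0.051?
df = 20  # arbitrary choice for illustration
for p in (0.049, 0.050, 0.051):
    t = stats.t.ppf(1 - p / 2, df)  # two-sided critical t for this P
    print(f"P = {p:.3f} requires |t| = {t:.3f} with df = {df}")
# The required t values differ only in the second decimal place:
# crossing the 0.05 line changes the verdict, not the evidence.
```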

Why do we make statistics so hard for our students?

(Warning: long and slightly wonkish)

If you’re like me, you’re continually frustrated by the fact that undergraduate students struggle to understand statistics. Actually, that’s putting it mildly: a large fraction of undergraduates simply refuse to understand statistics; mention a requirement for statistical data analysis in your course and you’ll get eye-rolling, groans, or (if it’s early enough in the semester) a rash of course-dropping.

This bothers me, because we can’t do inference in science without statistics*. Why are students so unreceptive to something so important? Continue reading