The J-shaped curve of blog-post popularity

Warning: navel-gazing.

Believe it or not (and I have some trouble believing it myself), I’ve written 235 posts for Scientist Sees Squirrel over the not-quite-three-years of its existence.  Some have made waves.  Others have vanished into the deep waters of the internet without the hint of a ripple.

I got thinking about this because last month I wrote a post called Statistics in Excel, and when is a Results section too short?.  It turned out, to my surprise, to be one of the “wave” ones: it was read just over 3,000 times in its first 48 hours.  I’m pretty sure that’s more eyeballs than my entire body of published work (79 papers plus The Scientist’s Guide to Writing) gets in a year*.  A few of my posts have been like that – not “viral” on a celebrity kind of scale, but read remarkably widely.  They almost always surprise me.  But I’m a scientist, and scientists like data, and WordPress provides some… so the plot above is a rank-frequency curve for the “lifetime” views of every post I’ve run since Scientist Sees Squirrel debuted**.  I see two interesting things in the curve – one at each end.

The first thing is that there’s some pattern in what kind of posts go semi-viral.  It’s not random.  Posts about statistics (coded with green) are pretty popular (so I guess I shouldn’t have been surprised by Statistics in Excel).  So are posts about peer review (coded with brown).  Perhaps this is a signal that I should write more posts about these topics, but I’m not going to.  Granted, I’ll continue to have stray thoughts about statistics and about peer review, and with the arrogance of the blogger, I’ll probably inflict them on you.  But I don’t plan to go looking for semi-viral topics.  That’s partly because having a post widely read makes me uncomfortable (and yes, I know this is weird).  More importantly, the popular posts aren’t necessarily my favourite posts to write.

Which brings me to the second interesting pattern.  My posts about the etymologies of Latin names (coded with pink) are… well, let’s say they have a niche audience, because that sounds better than “often, nobody cares”.  And yet, I love writing posts about the etymologies of Latin names.  (Haven’t read one of those?  You’re not alone.  I swear they’re more interesting than they sound.  Try starting with this one.)  I’m going to keep writing this kind of post, and mostly they’ll keep vanishing into the void achieving their niche audience, and that will just have to be that, I guess.

The J-shaped curve that’s so obvious in my post-popularity plot is a very general phenomenon.  You’d get the same shape, more or less, if you plotted sales for all the books on Amazon, or radio plays for all the rock songs ever recorded, the wealth of billionaires or the frequency of Amazonian trees, or any number of other things.  Early in the development of the World Wide Web, there were many suggestions that digitization and the internet would work to flatten these J-shaped curves, because there would be little cost for a consumer to find niche content or for a producer to distribute and promote it.  I don’t have data at hand, but my understanding is that in fact, the opposite has been true: the internet has proven a spectacularly effective tool for concentrating attention (on the latest viral cat video), not leveling it.  But digitization and the internet have at least allowed niche content to exist, so it can be found by its handful of avid consumers.  My pet posts – my etymologies of Latin names – have their handful of avid readers.  I’ll continue to write and post them for that handful (but mostly for me).  Blogging lets me write what I’m interested in.

Sorry about that.  I gather you’d all rather hear more about Statistics and Excel.  Hey, this blog is free, and you get what you pay for. 🙂

© Stephen Heard  December 7, 2017

*^Possibly much more, but I’d prefer not to think too hard about this.

**^These are raw view counts, uncorrected for age of post.  Older posts have had more time to accumulate views.  But pushing in the other direction, older posts first ran on a blog with a smaller audience.  It would take more careful analysis than I want to do to tease these effects apart.
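
For anyone curious how a rank-frequency curve like the one described in this post can be built, here is a minimal Python sketch.  The view counts in it are made up for illustration (they are not this blog’s data), and the function names `rank_frequency` and `loglog_slope` are hypothetical helpers, not anything WordPress provides.  The log-log slope is only a rough descriptive summary of how steeply views fall off with rank, not a rigorous power-law fit.

```python
import math


def rank_frequency(views):
    """Return (rank, views) pairs with the most-viewed post at rank 1."""
    ordered = sorted(views, reverse=True)
    return list(enumerate(ordered, start=1))


def loglog_slope(pairs):
    """Least-squares slope of log(views) against log(rank).

    Posts with zero views are skipped because log(0) is undefined.
    """
    points = [(math.log(r), math.log(v)) for r, v in pairs if v > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Hypothetical lifetime view counts, one per post (NOT real data).
hypothetical_views = [3000, 1200, 40, 650, 90, 300, 15, 150, 60, 25]

curve = rank_frequency(hypothetical_views)
slope = loglog_slope(curve)

for rank, count in curve:
    print(rank, count)
print("log-log slope:", round(slope, 2))
```

Plotting `curve` with rank on the x-axis and views on the y-axis gives the J shape; a more negative slope means attention is more concentrated in the top few posts.  Note that this uses raw counts, so it inherits the same age-of-post caveat as above.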


17 thoughts on “The J-shaped curve of blog-post popularity”

  1. The Visualizer

    Your post is an insightful one, Sir.
    In a way, it does explain why some content on the Internet becomes a viral meme while other content doesn’t, despite being available for the same amount of time.


  2. Artem Kaznatcheev

    I feel like these sorts of posts are useful for academic bloggers. They show you the sort of reach that blogging can have that our papers are seldom capable of achieving. I’ve written similar things for the 200th post on TheEGG, but haven’t had a chance to update the stats when I passed 250. I’ll definitely add a rank versus views plot to future iterations. It’d also be interesting to see how the exponent of the decay on these J-curves compares between blogs and to other media or model systems.

    I also like the motto of write for yourself. Another good guiding principle I heard: blog any email that gets too long. I’ve done this several times with feedback on preprints or recently published papers that I would have normally sent to just the authors but instead was able to avoid locking that conversation and knowledge in places only a few will see. This has to be done carefully as a blog starts to reach more people, though, since it can become too tall a soapbox.

    And blogs can grow a lot. At least for TheEGG, it was exponential growth from 2011 to 2015 with a readership doubling time of around 1.5 years. At least until I saturated my niche, or stumbled on blogging productivity. Have you found similar growth numbers over time? I feel like this would be another good thing to know for people that are starting, since the early days of a blog can be very lonely.

    So thanks for writing this. And is this technically a statistics post? Should we expect it to break into your top 100? 😉

  3. Macrobe

    Some of us may have reached saturation with all the stats posts circulating through the blogs like lost migrating birds.
    I, for one, enjoy your posts on Latin name etymologies. I share the same odd interest (as well as etymologies of common names). They often provide a window into many interesting aspects of science (and of ourselves) few readers take time or effort to learn.
    This is your blog. You have permission to write about things you enjoy.
    Carry on!


  4. John Dutton

    I’m an amateur generalist and love your posts. The wilder the better but particularly the ones about technical writing styles and of course excel statistics. Have a very merry Christmas.

  5. Jeremy Fox

    All of this lines up with our experience at Dynamic Ecology, and the data I’m aware of.

    Question: Is it widely known that the distribution of attention paid to anything (blog posts, movies, scientific papers, news events, whatever) is very highly skewed and always has been, no matter how you define or measure “attention”? And that you can’t change that by changing what filters people use to allocate their attention?* I feel like it must be widely known, but maybe that’s just because I happen to know it? So I’m curious–do lots of scientists mistakenly think that blogs, or Twitter, or preprint servers, or Google Scholar’s recommendation algorithm, or etc., are “democratizing” forces in the sense of creating less-skewed distributions of attention towards scientific work?

    Or perhaps I’m wrong. Maybe most everybody does know that the distribution of attention is invariably highly skewed. It’s just that some people want everyone to use different filters, so that 99% of the attention ends up going to a different 1% of the stuff than it currently does?

    *Absent some very strong mechanism for forcible redistribution of attention, of course. That’s one way to think of pre-publication peer review: as a mechanism that forces a roughly equal amount of attention to be paid to every submission. Ok, not totally equal–an ms might get desk-rejected after a quick read by an editor, another might get three reviewers rather than two, etc. But *much* more equal than the post-publication distribution of attention.


      1. Terry McGlynn

        And the funny thing is that, a priori, I have no idea which of those posts would be the ones that get a lot of views. Clearly we wouldn’t be doing this unless we saw value in doing the stuff that ends up in the long tail.


  6. Cathleen

    Nice power law curves there! Probably some fun network theory in here too… (i.e., maybe you have lots of statistically-minded followers on your twitter who are more likely to share your stats posts, and then views snowball from there based on how much network influence your sharers have.)


  7. Pingback: Arguing with myself | Scientist Sees Squirrel

  8. Pingback: Three years without an attention span | Scientist Sees Squirrel

  9. Pingback: “Guest-post SPAM” is a thing now | Scientist Sees Squirrel

  10. Pingback: Making people angry | Scientist Sees Squirrel

  11. Pingback: How to find a squirrel | Scientist Sees Squirrel
