Browsed by
Month: December 2019

Please stop teaching people to write about science in the passive voice

You might think you have to, but you don’t and you shouldn’t.

Why you might think you have to

  1. Science is objective and it doesn’t matter who does the experiment, so we should write in the passive voice, which emphasizes the methods and materials, not the scientists.
  2. You are teaching at <a level of education> and you have to prepare students for the <next level of education>, where they will be required to write in the passive voice.

Why you don’t have to

Regardless of how objective we think science is, writing about it in the passive voice doesn’t make it any more objective. Science is done by humans; there is no reason to pretend otherwise.

If you are teaching students to write in the passive voice because you think they need it at the next stage in the pipeline, you don’t have to.

If they learn to write in the active voice now, they can learn to write in the passive voice later, when and if they have to. And they might not have to.

A few years ago I surveyed the style guides of the top scientific journals in the world, and here’s what I found:

  1. None of them require the passive voice.
  2. Several of them have been begging scientists for decades to stop writing in the passive voice.

Here is the style guide from Science, from 1968, and it says:

“Choose the active voice more often than you choose the passive, for the passive voice usually requires more words and often obscures the agent of action.”

Here’s the style guide from Nature:

“Nature journals like authors to write in the active voice (“we performed the experiment…”) as experience has shown that readers find concepts and results to be conveyed more clearly if written directly.”

From personal correspondence with the production department at the Proceedings of the National Academy of Sciences USA (PNAS), I learned:

“[We] feel that accepted best practice in writing and editing favors active voice over passive.”

Top journals agree: you don’t have to teach students to write in the passive voice.

Why you shouldn’t

As a stylistic matter, excessive use of the passive voice is boring. As a practical matter, it is unclear.

For example, the following is the abstract of a paper I read recently. It describes prior work that was done by other scientists and summarizes new work done by the author. See if you can tell which is which.

The Lotka–Volterra model of predator–prey dynamics was used for approximation of the well-known empirical time series on the lynx–hare system in Canada that was collected by the Hudson Bay Company in 1845–1935. The model was assumed to demonstrate satisfactory data approximation if the sets of deviations of the model and empirical data for both time series satisfied a number of statistical criteria (for the selected significance level). The frequency distributions of deviations between the theoretical (model) trajectories and empirical datasets were tested for symmetry (with respect to the Y-axis; the Kolmogorov–Smirnov and Lehmann–Rosenblatt tests) and the presence or absence of serial correlation (the Swed–Eisenhart and “jumps up–jumps down” tests). The numerical calculations show that the set of points of the space of model parameters, when the deviations satisfy the statistical criteria, is not empty and, consequently, the model is suitable for describing empirical data.

L. V. Nedorezov, “The dynamics of the lynx–hare system: an application of the Lotka–Volterra model”.

Who used the model? Who assumed it was satisfactory? And who tested for symmetry?

I don’t know.

Please don’t teach students to write like this. It’s bad for them and anyone who has to read what they write, and it’s bad for science.

Handicapping pub trivia

Introduction

The following question was posted recently on Reddit’s statistics forum:

If there is a quiz of x questions with varying results between teams of different sizes, how could you logically handicap the larger teams to bring some sort of equivalence in performance measure?

[Suppose there are] 25 questions and a team of two scores 11/25. A team of 4 scores 17/25. Who did better […]?

One respondent suggested a binomial model, in which every player has the same probability of answering any question correctly.

I suggested a model based on item response theory, in which each question has a level of difficulty, d, each player has a level of efficacy, e, and the probability that a player answers a question correctly is

expit(e-d+c)

where c is a constant offset for all players and questions and expit is the inverse of the logit function.
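As a minimal sketch of that formula, here is one way to compute the probability for a single player and a single question. The function names `expit` and `prob_correct` and the default `c=0.0` are choices made for illustration, not necessarily what the notebook uses.

```python
import math


def expit(x):
    """Inverse of the logit function; maps any real number into [0, 1]."""
    # Branch on the sign of x so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def prob_correct(e, d, c=0.0):
    """Probability that a player with efficacy e gets a question of difficulty d."""
    return expit(e - d + c)


# A player whose efficacy matches the difficulty (with c = 0) is at 50%.
print(prob_correct(e=0.0, d=0.0))
```

Raising a player's efficacy or lowering a question's difficulty pushes the probability toward 1; the offset c shifts every player and question by the same amount.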

Another respondent pointed out that group dynamics will come into play. On a given team, it is not enough if one player knows the answer; they also have to persuade their teammates.

Me (left) at pub trivia with friends in Richmond, VA. Despite our numbers, we did not win.

I wrote some simulations to explore this question. You can see a static version of my notebook here, or you can run the code on Colab.

I implement a binomial model and a model based on item response theory. Interestingly, for the scenario in the question they yield opposite results: under the binomial model, we would judge that the team of two performed better; under the other model, the team of four was better.
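One rough way to see how the binomial model can favor the team of two: assume a team gets a question whenever at least one member does, and back out the per-player probability that would reproduce each team's score on average. This is a point-estimate sketch under those assumptions, not the analysis in the notebook, and `team_prob` and `implied_player_prob` are hypothetical names.

```python
def team_prob(p, n):
    """Probability that at least one of n players gets a question,
    if each player independently gets it with probability p."""
    return 1.0 - (1.0 - p) ** n


def implied_player_prob(score, num_questions, n):
    """Per-player probability at which a team of n would be expected
    to get score out of num_questions (the inverse of team_prob)."""
    team_rate = score / num_questions
    return 1.0 - (1.0 - team_rate) ** (1.0 / n)


# The scenario from the question: 11/25 for a team of two, 17/25 for a team of four.
p_two = implied_player_prob(11, 25, 2)
p_four = implied_player_prob(17, 25, 4)
print(round(p_two, 3), round(p_four, 3))
```

Under these assumptions the implied per-player probability comes out near 0.25 for both teams, and slightly higher for the team of two.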

In both cases I use a simple model of group dynamics: if anyone on the team gets a question, that means the whole team gets the question. So one way to think of this model is that “getting” a question means something like “knowing the answer and successfully convincing your team”.
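Here is a small simulation sketch of that group rule combined with the item response model, using only the standard library. The function `simulate_team_score`, the efficacies, and the difficulties are all made up for illustration; they are not taken from the notebook.

```python
import math
import random


def expit(x):
    """Inverse of the logit function, written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def simulate_team_score(efficacies, difficulties, c=0.0, rng=None):
    """Count the questions a team gets, where one player getting it is enough."""
    if rng is None:
        rng = random.Random()
    score = 0
    for d in difficulties:
        # The team gets the question if any single player gets it.
        if any(rng.random() < expit(e - d + c) for e in efficacies):
            score += 1
    return score


# Illustrative values only (assumptions, not data from the post).
rng = random.Random(17)
difficulties = [rng.gauss(0.0, 1.0) for _ in range(25)]
team_of_two = [0.0, 0.0]
team_of_four = [0.0, 0.0, 0.0, 0.0]
print(simulate_team_score(team_of_two, difficulties, rng=rng))
print(simulate_team_score(team_of_four, difficulties, rng=rng))
```

Running this many times with different seeds gives a distribution of scores for each team size, which is the kind of thing you would need in order to compare an 11/25 from two players with a 17/25 from four.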

Anyway, I’m not sure I really answered the question, other than to show that the answer depends on the model.