December 2019 - Probably Overthinking It

Please stop teaching people to write about science in the passive voice

December 18, 2019 AllenDowney

You might think you have to, but you don’t and you shouldn’t.

Why you might think you have to

Science is objective and it doesn’t matter who does the experiment, so we should write in the passive voice, which emphasizes the methods and materials, not the scientists.
You are teaching at <a level of education> and you have to prepare students for the <next level of education>, where they will be required to write in the passive voice.

Why you don’t have to

Regardless of how objective we think science is, writing about it in the passive voice doesn’t make it any more objective. Science is done by humans; there is no reason to pretend otherwise.

If you are teaching students to write in the passive voice because you think they need it at the next stage in the pipeline, you don’t have to.

If they learn to write in the active voice now, they can learn to write in the passive voice later, when and if they have to. And they might not have to.

A few years ago I surveyed the style guides of the top scientific journals in the world, and here’s what I found:

None of them require the passive voice.
Several of them have been begging scientists for decades to stop writing in the passive voice.

The style guide from Science, which dates to 1968, says:

“Choose the active voice more often than you choose the passive, for the passive voice usually requires more words and often obscures the agent of action.”

Here’s the style guide from Nature:

“Nature journals like authors to write in the active voice (“we performed the experiment…” ) as experience has shown that readers find concepts and results to be conveyed more clearly if written directly.”

From personal correspondence with the production department at the Proceedings of the National Academy of Sciences USA (PNAS), I learned:

“[We] feel that accepted best practice in writing and editing favors active voice over passive.”

Top journals agree: you don’t have to teach students to write in the passive voice.

Why you shouldn’t

As a stylistic matter, excessive use of the passive voice is boring. As a practical matter, it is unclear.

For example, the following is the abstract of a paper I read recently. It describes prior work that was done by other scientists and summarizes new work done by the author. See if you can tell which is which.

The Lotka–Volterra model of predator–prey dynamics was used for approximation of the well-known empirical time series on the lynx–hare system in Canada that was collected by the Hudson Bay Company in 1845–1935. The model was assumed to demonstrate satisfactory data approximation if the sets of deviations of the model and empirical data for both time series satisfied a number of statistical criteria (for the selected significance level). The frequency distributions of deviations between the theoretical (model) trajectories and empirical datasets were tested for symmetry (with respect to the Y-axis; the Kolmogorov–Smirnov and Lehmann–Rosenblatt tests) and the presence or absence of serial correlation (the Swed–Eisenhart and “jumps up–jumps down” tests). The numerical calculations show that the set of points of the space of model parameters, when the deviations satisfy the statistical criteria, is not empty and, consequently, the model is suitable for describing empirical data.
L. V. Nedorezov, “The dynamics of the lynx–hare system: an application of the Lotka–Volterra model”.

Who used the model? Who assumed it was satisfactory? And who tested for symmetry?

I don’t know.

Please don’t teach students to write like this. It’s bad for them and anyone who has to read what they write, and it’s bad for science.

Handicapping pub trivia

December 8, 2019 AllenDowney

Introduction

The following question was posted recently on Reddit’s statistics forum:

If there is a quiz of x questions with varying results between teams of different sizes, how could you logically handicap the larger teams to bring some sort of equivalence in performance measure?
[Suppose there are] 25 questions and a team of two scores 11/25. A team of 4 scores 17/25. Who did better […]?

One respondent suggested a binomial model, in which every player has the same probability of answering any question correctly.

I suggested a model based on item response theory, in which each question has a level of difficulty, d, each player has a level of efficacy, e, and the probability that a player answers a question correctly is

expit(e-d+c)

where c is a constant offset for all players and questions and expit is the inverse of the logit function.
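To make the formula concrete, here is a minimal sketch in Python. The function names and the example values of e, d, and c are my own choices for illustration; they are not taken from the notebook.

```python
import math

def expit(x):
    """Inverse of the logit function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def prob_correct(e, d, c=0.0):
    """Probability that a player with efficacy e answers a question
    of difficulty d correctly, with constant offset c."""
    return expit(e - d + c)

# Illustrative values only (assumptions, not estimates from data)
print(prob_correct(e=0.0, d=0.0))  # player and question evenly matched
print(prob_correct(e=1.0, d=0.0))  # stronger player, same question
print(prob_correct(e=0.0, d=1.0))  # same player, harder question
```

Only the difference e-d+c matters: raising a player's efficacy by one unit has the same effect as lowering the question's difficulty by one unit.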

Another respondent pointed out that group dynamics will come into play. On a given team, it is not enough if one player knows the answer; they also have to persuade their teammates.

Me (left) at pub trivia with friends in Richmond, VA. Despite our numbers, we did not win.

I wrote some simulations to explore this question. You can see a static version of my notebook here, or you can run the code on Colab.

I implement a binomial model and a model based on item response theory. Interestingly, for the scenario in the question they yield opposite results: under the binomial model, we would judge that the team of two performed better; under the other model, the team of four was better.

In both cases I use a simple model of group dynamics: if anyone on the team gets a question, that means the whole team gets the question. So one way to think of this model is that “getting” a question means something like “knowing the answer and successfully convincing your team”.
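For the binomial model, this group rule has a simple closed form: if each of n players independently gets a question with probability p, the team gets it with probability 1-(1-p)^n. The sketch below inverts that formula to find the per-player probability implied by each team's score. It is a rough point estimate under the assumption of independent, identical players, not the analysis in the notebook, and the function names are hypothetical.

```python
def team_prob(p, n):
    """Probability that at least one of n independent players gets a question."""
    return 1.0 - (1.0 - p) ** n

def implied_player_prob(score, total, n):
    """Per-player probability that would make score/total the expected team rate."""
    team_rate = score / total
    return 1.0 - (1.0 - team_rate) ** (1.0 / n)

# The scenario from the question: 11/25 for a team of two, 17/25 for a team of four
p_two = implied_player_prob(11, 25, n=2)
p_four = implied_player_prob(17, 25, n=4)
print(round(p_two, 3), round(p_four, 3))
```

By this estimate the players on the team of two come out slightly ahead, which matches the direction of the binomial result above. The gap is small, though, and with only 25 questions it could easily be noise.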

Anyway, I’m not sure I really answered the question, other than to show that the answer depends on the model.

Political alignment and beliefs about homosexuality

December 3, 2019 AllenDowney

In the United States, beliefs and attitudes about homosexuality have changed drastically over the last 50 years. In 1972, 74% of U.S. residents thought sexual relations between two adults of the same sex were “always wrong”, according to results from the General Social Survey (GSS). In 2018, that fraction was down to 33%, and another 58% thought same-sex relations were “not wrong at all”.

Here’s what the distribution of responses looks like over the duration of the survey:

Distribution of responses to the question “What about sexual relations between two adults of the same sex—do you think it is always wrong, almost always wrong, wrong only sometimes, or not wrong at all?”

In the late 1980s, the fraction of “always wrong” responses started dropping, replaced almost entirely by “not wrong at all”. Respondents who chose “almost always wrong” or “wrong only sometimes” have always been a small minority.

Political alignment

As you might expect, these responses are related to political alignment, that is, to whether respondents describe themselves as liberal, conservative, or moderate.

The following figure shows the fraction of “always wrong” responses over time, grouped by political alignment:

Fraction of respondents who think sexual relations between two adults of the same sex are “always wrong”, grouped by self-described political alignment.

The circles in this figure show the observed percentages in each group during each year. The lines show a smooth curve computed by local regression.
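If local regression is unfamiliar, the idea is to fit a small weighted line around each point and read the smoothed value off that line. Here is a minimal pure-Python sketch, assuming tricube weights and a local linear fit. It is not the code behind the figure, which may use a library implementation with different settings, and the numbers in the example are made up, not GSS data.

```python
def local_regression(xs, ys, frac=0.5):
    """Smooth ys against xs with locally weighted linear fits (tricube weights)."""
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of points in each neighborhood
    smoothed = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point (itself included)
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0
        # Tricube weights: nearby points count more, points at or beyond h count zero
        ws = [max(0.0, 1.0 - (abs(x - x0) / h) ** 3) ** 3 for x in xs]
        # Weighted least-squares line through the neighborhood
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        # Evaluate the local line at x0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed

# Made-up noisy percentages for illustration only
years = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
pct = [80, 78, 81, 76, 74, 75, 70, 68, 69, 64]
print([round(v, 1) for v in local_regression(years, pct)])
```

A larger `frac` uses more neighbors for each fit and gives a smoother curve; a smaller one follows the circles more closely.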

Unsurprisingly, people who consider themselves conservative are consistently more likely than liberals to believe homosexuality is wrong. And moderates fall somewhere between liberals and conservatives.

What might be more surprising is how conservative self-described liberals were in 1972: almost 60% of them thought homosexuality was always wrong.

You might also be surprised at how liberal self-described conservatives are now: the fraction who think homosexuality is always wrong is down to 60%. In other words, by this measure, conservatives now are about as liberal as liberals were in 1972.

The more things change…

As we saw in a previous article, the fractions of liberals and conservatives do not change much over time. The following figure shows the proportions for GSS respondents:

Self-described political alignment over time.

I conjecture that people describe themselves relative to a perceived center of mass of public opinion. If they are more conservative than what they think is the mean, they are more likely to say they are “conservative”.

But what that means, in terms of beliefs and attitudes, changes over time. And with some issues, it changes quite fast.

The data and code I used for this article are in this GitHub repository. If you would like to run the same analysis with other variables in the GSS, you can run this Jupyter notebook.
