September 2022 - Probably Overthinking It

On pace for 2-hour marathon in 2035

September 25, 2022 AllenDowney

On September 25, 2022, Eliud Kipchoge ran the Berlin Marathon in 2:01:09, breaking his own world record by 30 seconds and taking another step in the progression toward a two-hour marathon.

In a previous article, I noted that the marathon record speed since 1970 has been progressing linearly over time, and I proposed a model that explains why we might expect it to continue. Based on a linear extrapolation of the data so far, I predicted that someone would break the two-hour barrier in 2036, plus or minus five years.

Now it is time to update my predictions in light of the new record. The following figure shows the progression of world record speed since 1970 (orange dots), a linear fit to the data (green line) and a 90% predictive confidence interval (shaded area).

This model predicts that we will see a two-hour marathon in 2035 plus or minus 6 years. Since the last two points are above the long-term trend, we might expect to cross the finish line on the early end of that range.
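To make the target concrete, here is a small sketch of the arithmetic that converts a finishing time into an average speed. The 42.195 km distance is the standard marathon length; the function name and the choice of km/h are my assumptions for illustration, and the figure in the original analysis may use different units.

```python
MARATHON_KM = 42.195  # standard marathon distance

def average_speed_kmh(hours, minutes, seconds):
    """Average speed in km/h for a marathon finished in the given time."""
    total_hours = hours + minutes / 60 + seconds / 3600
    return MARATHON_KM / total_hours

record_speed = average_speed_kmh(2, 1, 9)   # Kipchoge, Berlin 2022
target_speed = average_speed_kmh(2, 0, 0)   # the two-hour barrier

print(f"record: {record_speed:.3f} km/h")
print(f"target: {target_speed:.3f} km/h")
print(f"gap:    {target_speed - record_speed:.3f} km/h")
```

A linear model of record speed predicts the two-hour marathon for the year when the fitted line reaches the target speed.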

This analysis is one of the examples in Chapter 17 of Think Bayes; you can read it here, or you can click here to run the code in a Colab notebook.

If you like this sort of thing, you will like my forthcoming book, called Probably Overthinking It, which is about using evidence and reason to answer questions and guide decision making. If you would like to get an occasional update about the book, please join my mailing list.

Polarization and partisan sorting

September 11, 2022 AllenDowney

I’m working on a book called Probably Overthinking It that is about using evidence and reason to answer questions and guide decision making. If you would like to get an occasional update about the book, please join my mailing list.

In the previous article, I used data from the General Social Survey (GSS) to show that polarization on an individual level has increased since the 1970s, but not by very much. I identified fifteen survey questions that distinguish conservatives and liberals; for each respondent, I estimated the number of conservative responses they would give to these questions.

Since the 1970s, the average number of conservative responses has decreased consistently. The spread of the distribution has increased slightly, but if we quantify the spread as a mean absolute difference, here’s what it means: in 1986, if you chose two people at random and asked them the fifteen questions, they would differ on 3.4 questions, on average. If you repeated the experiment in 2016, they would differ on 3.6.

That is not a substantial change.

But even without polarization at the individual level, there can be polarization at the level of political parties. So that’s what this article is about.

Here’s an overview of the results:

In the 1970s, there was not much difference between Democrats and Republicans. On average, their answers to the fifteen questions were about the same.
Since then, both parties have moved to the left (and nonpartisans, too). But the Democrats have moved faster, so the gap between the parties has increased.

Although it is tempting to interpret these results as a case of Democrats careening off to the left, that is a misleading way to tell the story, because it sounds like we are following a group of people over time. The groups we call Democrats and Republicans are not the same groups of people over time. The composition of the groups changes due to:

Generational replacement: In the conveyor belt of demography, when old people die, they are replaced by young people, and
Partisan sorting: No one is born Democrat or Republican; rather, they choose a party (or not) at some point in their lives, and this sorting process has changed over time.

These two factors explain the increasing ideological difference between Democrats and Republicans: polarization between the parties is not caused by people changing their minds; it is caused by changes in the composition of the groups.

Let me show you what I mean.

The race to the bottom

Each GSS respondent was asked “Generally speaking, do you usually think of yourself as a Republican, Democrat, Independent, or what?” Their responses were coded on a 7-point scale, but for simplicity, I’ve reduced it to three: Republican, Democrat, and Nonpartisan (which I think is more precise than “Independent”).

The following figure shows how the percentage of respondents in each group has changed over time:

The percentage of Democrats was decreasing until 2000; the percentage of Republicans has decreased since. But there is no sign of increasing partisanship. In fact, the percentage of Nonpartisans has increased to the point where they are now the plurality.

Now let’s see what these groups believe, on average. The following figure shows the estimated number of conservative responses for each group over time.

In the 1970s, there was not much difference between Democrats and Republicans. In fact, Democrats were more conservative than Nonpartisans. Since then, all three groups have moved to the left; that is, they give more liberal responses to the fifteen questions. But Democrats have moved faster than the other groups.

These are just averages; we can get a better view of what’s happening by looking at the distributions. The following figure shows the distribution of responses for the three groups in 1988 and 2018.

In 1988, the three groups were almost indistinguishable. In 2018, they still overlap substantially, but Democrats have shifted farther to the left than the other two groups. As a result, the difference in the means between the groups has increased, as shown in this figure:

Now it might be clearer why I chose 1988, which is close to the lowest point in the long-term trend, and 2018, which is the most recent point that is not subject to the effects of data collection during the pandemic.

It might be tempting, especially for conservative Republicans, to interpret these graphs as a case of Democrats going off the rails, but I think that’s misleading. Again, these groups are not made up of the same people over time, so this is not a story about people whose views are changing. It is a story about groups whose composition is changing.

Partisan sorting

In the 1970s, there was not much difference between Democrats and Republicans, in terms of their political views. Now, in the 2020s, there is. So what changed?

To find out, let’s look at the relationship between conservatism, as measured by responses to the fifteen questions, and party affiliation. The following figure shows a scatter plot of these values in 1988 and 2018. The purple lines show a smoothed average of party affiliation as a function of conservatism.

In 1988, the relationship between these values was weak. Someone who gave conservative responses to the questions was only marginally more likely to identify as Republican; someone who gave liberal responses was only marginally more likely to identify as Democrat.

Since then, the relationship has grown stronger. The following figure shows the correlation between these values over time:

In 1988, the correlation was about 0.15, which is quite weak; in 2018 it was almost 0.4. That’s substantially higher, but it is still not a strong correlation. As you can see in the scatter plots, there are still people with liberal views who call themselves Republicans (even if Republicans have different names for them) and people with conservative views who consider themselves Democrats.

This is why I think it’s misleading to say that Democrats have moved to the left; rather, people have sorted themselves into parties according to their beliefs, at least more than they used to.

Consider this analogy: In my first year of middle school, we all took the same physical education class, so we all played the same sports. During some weeks, everyone played basketball; during other weeks, everyone wrestled. So that year, the wrestlers and the basketball players were the same height, on average.

The next year, we got to choose which sports to play. As you would expect, taller people were more likely to choose basketball and shorter people were more likely to wrestle. So, all of a sudden, the basketball players were taller than the wrestlers, on average.

Does that mean the basketball players got taller and the wrestlers got shorter? That would not be a reasonable interpretation, because they were not the same groups of people. The increased difference between the groups was entirely because of how they sorted themselves out.
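The analogy is easy to check with a simulation. The following is a minimal sketch with made-up heights and an arbitrary logistic choice rule, not data from any real school. In the first year, a coin flip stands in for "sport is unrelated to height"; in the second year, taller students are more likely to choose basketball. The heights themselves never change.

```python
import numpy as np

rng = np.random.default_rng(17)

# Hypothetical student heights in cm; nobody's height changes between years.
heights = rng.normal(loc=160, scale=8, size=10_000)

def mean_gap(heights, plays_basketball):
    """Mean height of basketball players minus mean height of wrestlers."""
    return heights[plays_basketball].mean() - heights[~plays_basketball].mean()

# Year 1: sport is unrelated to height (modeled here as a coin flip).
year1 = rng.random(heights.size) < 0.5

# Year 2: taller students are more likely to choose basketball.
# The logistic slope of 0.1 per cm is an arbitrary illustrative value.
p_basketball = 1 / (1 + np.exp(-0.1 * (heights - 160)))
year2 = rng.random(heights.size) < p_basketball

gap1 = mean_gap(heights, year1)
gap2 = mean_gap(heights, year2)
print(f"year 1 gap: {gap1:.2f} cm")
print(f"year 2 gap: {gap2:.2f} cm")
```

The gap between the groups appears in the second year even though the `heights` array is identical in both years; only the group membership changed.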

Let’s see if the same is true for Democrats and Republicans.

Getting counterfactual

How much of the increased difference since the 1980s can be explained by partisan sorting? To answer that question, I used ordinal logistic regression to model the relationship between conservative views and party affiliation for each year of the survey. Then I used the model to simulate the sorting process under two scenarios:

With observed changes in partisan sorting: In this version, I used the model from each year to simulate the sorting process for that year, so we expect the results to be similar to the actual data.
With no change in partisan sorting: In this version, I built a model using data from the low correlation period (1973 to 1985) — then I used this model to simulate the sorting process for every year.

The second scenario is meant to simulate what would have happened if the sorting process had not changed. The following figure shows the results.

When the model includes the observed changes in partisan sorting, it matches the data well, which shows that the model includes the essential features that replicate the observed increase in the difference between Democrats and Republicans.

When we run the model with no increase in partisan sorting, there is no increase in the difference between Democrats and Republicans. This result shows that the observed increase in partisan sorting is sufficient to explain the entire increase in difference between the parties.
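The logic of the two scenarios can be sketched with synthetic data. This is not the code behind the figure: the cutpoints, slopes, years, and population parameters below are all hypothetical, and I center the predictor each year (my simplification) so that party shares stay roughly constant. The sketch draws party labels from an ordered logit model, once with a slope that grows over time and once with the slope frozen at its starting value.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def simulate_party(views, slope, cutpoints, rng):
    """Draw party labels (0=Democrat, 1=Nonpartisan, 2=Republican) from an
    ordered logit model: P(party <= k) = sigmoid(c_k - slope * x)."""
    x = views - views.mean()  # center the predictor (a simplification)
    cum_probs = sigmoid(cutpoints[None, :] - slope * x[:, None])
    u = rng.random(views.size)
    # Party index = number of cumulative probabilities that u exceeds.
    return (u[:, None] > cum_probs).sum(axis=1)

def party_gap(views, party):
    """Mean conservatism of Republicans minus that of Democrats."""
    return views[party == 2].mean() - views[party == 0].mean()

rng = np.random.default_rng(3)
cutpoints = np.array([-0.5, 0.5])           # hypothetical thresholds
years = [1988, 1998, 2008, 2018]
observed_slopes = [0.10, 0.17, 0.24, 0.30]  # hypothetical, increasing
baseline_slope = observed_slopes[0]         # sorting frozen at the start

gaps_observed, gaps_frozen = [], []
for year, slope in zip(years, observed_slopes):
    # Hypothetical population: views on a 0-15 scale, drifting left.
    mean_views = 7.5 - 0.07 * (year - 1988)
    views = np.clip(rng.normal(mean_views, 3.0, size=20_000), 0, 15)
    party_obs = simulate_party(views, slope, cutpoints, rng)
    party_frozen = simulate_party(views, baseline_slope, cutpoints, rng)
    gaps_observed.append(party_gap(views, party_obs))
    gaps_frozen.append(party_gap(views, party_frozen))

for year, g_obs, g_frz in zip(years, gaps_observed, gaps_frozen):
    print(f"{year}: changing sorting {g_obs:.2f}, frozen sorting {g_frz:.2f}")
```

In this toy version, everyone's views drift left in both scenarios, but the gap between the parties grows only when the slope, which represents the strength of sorting, is allowed to increase.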

Alignment is not polarization

Compared to the 1980s, there is more alignment now between political ideology and political parties. Liberals are more likely to identify as Democrats and conservatives are more likely to identify as Republicans. As a result, the parties are more differentiated now than they were.

This alignment might not be a bad thing. In a two-party system, it might be desirable if one party represents a more conservative world view than the other. If voters have only two options, the options should be different in ways voters care about. That way, at least, you know what you are voting for. The alternative, with no substantial difference between parties, is a recipe for voter frustration and disengagement.

Of course, there are problems with extreme partisanship. But I don’t think a moderate level of alignment — which is what we have — is necessarily a problem.

Are we really polarized?

September 5, 2022 AllenDowney

I’m a little tired of hearing about how polarized we are, partly because I suspect it’s not true and mostly because it is “not even wrong” — that is, not formulated as a meaningful hypothesis.

To make it meaningful, we can start by distinguishing “mass polarization”, which is movement of popular attitudes toward the extremes, from polarization at the level of political parties. I’ll address mass polarization in this article and the other kind in the next.

The distribution of attitudes

To see whether popular opinion is moving toward the extremes, I’ll use data from the General Social Survey (GSS). And I’ll build on the methodology I presented in this previous article, where I identified fifteen questions that most strongly distinguish conservatives and liberals.

Not all respondents were asked all fifteen questions, but with some help from item response theory, I estimated the number of conservative responses each respondent would give, if they had been asked. From that, we can estimate the distribution of responses during each year of the survey. For example, here’s a comparison of distributions from 1986 and 2016:

First, notice that the distributions have one big mode near the middle, not two modes at the extremes. That means that most people choose a mixture of liberal and conservative responses to the questions; the people at the extremes are a small minority.

Second, the distribution shifted to the left during this period. In 1986, the average number of conservative responses was 7.6; in 2016, it was 5.5.

Now, let’s see what happened during the other years.

It’s just a jump to the left

To summarize how the distribution of responses has changed over time, we’ll look at the mean, standard deviation, and mean absolute difference. Here’s the mean for each year of the survey:

The average level of conservatism, as measured by the fifteen questions, has been declining consistently for the duration of the GSS, almost 50 years.

The ‘x’ markers highlight 1986 and 2016, the years I chose in the previous figure. So we can see that these years are not anomalies; they are both close to the long-term trend.

Measuring polarization

Now, to see if we are getting more polarized, let’s look at the standard deviation of the distribution over time:

The spread of the distribution was decreasing until the 1980s and has been increasing ever since. The value for 2021 is substantially above the long-term trend, but it’s too early to say whether that’s a real change in polarization or an artifact of pandemic-related changes in data collection.

Again, the ‘x’ markers highlight the years I chose, and show why I chose them: they are close to the lowest and highest points in the long-term trend.

So there is some evidence of popular polarization. In fact, using standard deviation to quantify polarization might underestimate the size of the change, because the tail of the most recent distribution is compressed at the left end of the scale.

However, it is hard to interpret a change in standard deviation in practical terms. It was 3.0 in 1986 and 3.2 in 2016; is that a big change? I don’t know.

Don’t get MAD, get mean absolute difference

We can mitigate the effect of compression and help with interpretation by switching to a different measure of spread, mean absolute difference, which is the average size of the differences between pairs of people. Here’s how this measure has changed over time:

The mean absolute difference (MADiff) follows the same trend as standard deviation. It decreased until the 1980s and has increased ever since. It was 3.4 in 1986, which means that if you chose two people at random and asked them the fifteen questions, they would differ on 3.4 questions, on average. If you repeated the experiment in 2016, they would differ by 3.6.
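Here is a minimal sketch of how this measure can be computed from a sample. The five response counts are made up for illustration (they are not GSS data), and the all-pairs approach below is only one way to do it; the original analysis may compute the value differently, for example from an estimated distribution.

```python
import numpy as np

def mean_absolute_difference(values):
    """Average |a - b| over all pairs of distinct respondents."""
    values = np.asarray(values, dtype=float)
    n = values.size
    diffs = np.abs(values[:, None] - values[None, :])
    # The diagonal (each person paired with themselves) contributes zero,
    # so dividing by n * (n - 1) averages over distinct ordered pairs.
    return diffs.sum() / (n * (n - 1))

# Hypothetical numbers of conservative responses for five respondents.
responses = [4, 6, 7, 9, 11]
mad = mean_absolute_difference(responses)
sd = np.std(responses)
print(f"mean absolute difference: {mad:.2f}")
print(f"standard deviation:       {sd:.2f}")
```

The result is in the same units as the data, questions out of fifteen, which is what makes it easier to interpret than a standard deviation. The pairwise matrix needs memory proportional to the square of the sample size, so for large samples a sorted-cumulative-sum formula or random pairs would be more practical.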

This measure of polarization suggests that the increase since the 1980s has not been big enough to make much of a difference. If civilized society can survive when people disagree on 3.4 questions, it’s hard to imagine that the walls will come tumbling down when they disagree on 3.6.

In conclusion, it doesn’t look like mass polarization has changed much in the last 50 years, and certainly not enough to justify the amount of coverage it gets.

But there is another kind of polarization, at the level of political parties, that might be a bigger problem. I’ll get to that in the next article.
