The mean of a Likert scale?

Here’s another installment in Data Q&A: Answering the real questions with Python. Previous installments are available from the Data Q&A landing page.

likert_mean

Testing Percentiles

Here’s another installment in Data Q&A: Answering the real questions with Python. Previous installments are available from the Data Q&A landing page.

test_percentile

Small percentiles and missing data

Here’s another installment in Data Q&A: Answering the real questions with Python. Previous installments are available from the Data Q&A landing page.

low_percentile

What does “strength” mean?

Here’s another installment in Data Q&A: Answering the real questions with Python. Previous installments are available from the Data Q&A landing page.

corr_trend

What does a confidence interval mean?

Here’s another installment in Data Q&A: Answering the real questions with Python. In general, I will try to focus on practical problems, but this one is a little more philosophical.

confidence

Standard deviation of a count

This post is part of a new project with the working title Data Q&A: Answering the real questions with Python. In each installment, I’ll take a question from Reddit’s statistics forum and answer it, using Python code to demonstrate. My answer is in a Jupyter notebook — see the link below to run it in Colab.

count_data

Data Q&A

Today I’m starting a new project with the working title Data Q&A: Answering the real questions with Python. In each installment, I’ll take a question from Reddit’s statistics forum and answer it, using Python code to demonstrate. The first installment is a question about the harmonic mean, which is a recurring topic of discussion on Reddit. It’s in a Jupyter notebook — see the link below to run it in Colab.

harmonic

Think Python Goes to Production

Think Python has moved into production, on schedule for the official publication date in July — but maybe earlier if things go well.

To celebrate, I have posted the next batch of chapters on the new site, up through Chapter 12, which is about Markov text analysis and generation, one of my favorite examples in the book. From there, you can follow links to run the notebooks on Colab.

And we have a cover!

The new animal is a ringneck parrot, I’ve been told. I will miss the Carolina parakeet that was on the old cover, which was particularly apt because it is an ex-parrot. Nevertheless, I think the new cover looks great!

Huge thanks to Sam Lau and Luciano Ramalho for their technical reviews. Both made many helpful corrections and suggestions that improved the book. Sam is an expert on learning to program with AI assistants. And Luciano was inspired by the turtles to make an improved module for turtle graphics in Jupyter, called jupyturtle. Here’s an example of what it looks like (from Chapter 5):

If you have a chance to check out the current draft, and you have any corrections or suggestions, please create an issue on GitHub.

And if you would like a copy of the book as soon as possible, you can read the Early Release version and order from O’Reilly here or pre-order the third edition from Amazon.

The Gender Gap in Political Beliefs Is Small

In previous articles (here, here, and here) I’ve looked at evidence of a gender gap in political alignment (liberal or conservative), party affiliation (Democrat or Republican), and policy preferences.

Using data from the GSS, I found that women are more likely to say they are liberal, and more likely to say they are Democrats, by 5-10 percentage points. But in their responses to 15 policy questions that most distinguish conservatives and liberals, men and women give similar answers.

In other words, the political gap is mostly in what people say about themselves, not in what they believe about specific policy questions.

Now let’s see if we get similar results with ANES data. As with the GSS, I looked for questions where liberals and conservatives give different answers. From those, I selected questions about specific policies, plus four questions related to moral foundations, with preference for questions asked over a long period of time. Here are the 16 topics that met these criteria:

For each question, I identified one or more responses that were more likely to be given by conservatives, which is what I’m calling “conservative responses”.
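The counting step can be sketched in a few lines of Python. This is only an illustration, not the code from the analysis: the question ids, the answer labels, and the `count_conservative` helper are all hypothetical, and the sketch simply skips unanswered questions rather than filling them in.

```python
# Hypothetical mapping from question id to the answers treated as
# "conservative responses". These are placeholders, not ANES items.
CONSERVATIVE = {
    "q1": {"oppose"},
    "q2": {"agree", "strongly agree"},
    "q3": {"decrease"},
}


def count_conservative(responses):
    """Count the answers that fall in the conservative set for their question.

    Questions the respondent was not asked (answer is None) are skipped,
    not imputed.
    """
    return sum(
        1
        for question, answer in responses.items()
        if answer is not None and answer in CONSERVATIVE.get(question, set())
    )


# One made-up respondent who answered all three toy questions
respondent = {"q1": "oppose", "q2": "disagree", "q3": "decrease"}
print(count_conservative(respondent))  # 2
```

With one count per respondent, comparing groups comes down to comparing the average count in each group.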

Not every respondent was asked every question, so I used a Bayesian method based on item response theory to fill missing values. You can get the details of the method here.

As in the GSS data, the average number of conservative responses has gone down over time.

Men give more conservative responses than women, on average, but the difference is only half a question, and the gap is not getting bigger.

Among people younger than 30, the gap is closer to 1 question, on average. And it is not growing.

In summary:

  • In the ANES, there is no evidence of a growing gender gap in political alignment, party affiliation, or policy preferences.
  • In both the GSS and the ANES, the gap in policy preferences is small and not growing.

The details of this analysis are in this Jupyter notebook.

What about economics?

Many of the questions in the previous section are about social issues. On economic issues some of the patterns are different. Here are 15 questions I selected that are mostly about federal spending.

Unlike the social issues, which trend liberal over time, responses to these questions are almost unchanged.

In the general population, the gender gap is about 0.5 questions and not growing.

Among young adults, the gender gap is smaller, and not growing.

On a total of 30 questions where conservatives and liberals disagree, men and women provide similar responses.

Think Python third edition!

I am happy to announce the third edition of Think Python, which will be published by O’Reilly Media later this year.

You can read the online version of the book here. I’ve posted the Preface and the first four chapters — more on the way soon!

You can read the Early Release and pre-order from O’Reilly, or pre-order the third edition on Amazon.

Here is an excerpt from the Preface that explains…

What’s new in the third edition?

The biggest changes in this edition were driven by two new technologies — Jupyter notebooks and virtual assistants.

Each chapter of this book is a Jupyter notebook, which is a document that contains both ordinary text and code. For me, that makes it easier to write the code, test it, and keep it consistent with the text. For readers, it means you can run the code, modify it, and work on the exercises, all in one place.

The other big change is that I’ve added advice for working with virtual assistants like ChatGPT and using them to accelerate your learning. When the previous edition of this book was published in 2016, the predecessors of these tools were far less useful and most people were unaware of them. Now they are a standard tool for software engineering, and I think they will be a transformational tool for learning to program — and learning a lot of other things, too.

The other changes in the book were motivated by my regrets about the second edition.

The first is that I did not emphasize software testing. That was already a regrettable omission in 2016, but with the advent of virtual assistants, automated testing has become even more important. So this edition presents Python’s most widely used testing tools, doctest and unittest, and includes several exercises where you can practice working with them.
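To give a sense of what these two tools look like, here is a minimal sketch. The `is_palindrome` function and its tests are made up for illustration and are not taken from the book; the calls to `doctest` and `unittest` are from the standard library.

```python
import doctest
import unittest


def is_palindrome(word):
    """Check whether a word reads the same forward and backward.

    >>> is_palindrome("noon")
    True
    >>> is_palindrome("parrot")
    False
    """
    return word == word[::-1]


class TestIsPalindrome(unittest.TestCase):
    """The same two checks, written as unittest test methods."""

    def test_palindrome(self):
        self.assertTrue(is_palindrome("noon"))

    def test_not_palindrome(self):
        self.assertFalse(is_palindrome("parrot"))


# doctest: run the examples in the docstring; prints nothing if they pass
doctest.run_docstring_examples(is_palindrome, globals(), name="is_palindrome")

# unittest: load and run the test case without exiting the interpreter,
# which makes this form convenient inside a notebook
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestIsPalindrome)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The doctest version keeps the examples next to the code they describe, while the unittest version scales better when there are many cases to check.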

My other regret is that the exercises in the second edition were uneven — some were more interesting than others and some were too hard. Moving to Jupyter notebooks helped me develop and test a more engaging and effective sequence of exercises.

In this revision, the sequence of topics is almost the same, but I rearranged a few of the chapters and compressed two short chapters into one. Also, I expanded the coverage of strings to include regular expressions.

A few chapters use turtle graphics. In previous editions, I used Python’s turtle module, but unfortunately it doesn’t work in Jupyter notebooks. So I replaced it with a new turtle module that should be easier to use. Here’s what it looks like in the notebooks.

Finally, I rewrote a substantial fraction of the text, clarifying places that needed it and cutting back in places where I was not as concise as I could be.

I am very proud of this new edition — I hope you like it!