Reject Math Supremacy

December 14, 2024 AllenDowney

The premise of Think Stats, and the other books in the Think series, is that programming is a tool for teaching and learning — and many ideas that are commonly presented in math notation can be more clearly presented in code.

In the draft third edition of Think Stats there is almost no math — not because I made a special effort to avoid it, but because I found that I didn’t need it. For example, here’s how I present the binomial distribution in Chapter 5:

Mathematically, the distribution of these outcomes follows a binomial distribution, which has a PMF that is easy to compute.

from scipy.special import comb

def binomial_pmf(k, n, p):
    return comb(n, k) * (p**k) * ((1 - p) ** (n - k))

SciPy provides the comb function, which computes the number of combinations of n things taken k at a time, often pronounced “n choose k”.

binomial_pmf computes the probability of getting k hits out of n attempts, given p.

I could also present the PMF in math notation, but I’m not sure how it would help — the Python code represents the computation just as clearly. Some readers find math notation intimidating, and even for the ones who don’t, it takes some effort to decode. In my opinion, the payoff for this additional effort is too low.

But one of the people who read the draft disagrees. They wrote:

Provide equations for the distributions. You assume that the reader knows them and then you suddenly show a programming code for them — the code is a challenge to the reader to interpret without knowing the actual equation.

I acknowledge that my approach defies the expectation that we should present math first and then translate it into code. For readers who are used to this convention, presenting the code first is “sudden”.

But why is math first the convention? I think there are two reasons, one practical and one philosophical:

The practical reason is the presumption that the reader is more familiar with math notation and less familiar with code. Of course that’s true for some people, but for other people, it’s the other way around. People who like math have lots of books to choose from; people who like code don’t.

The philosophical reason is what I’m calling math supremacy, which is the idea that math notation is the real thing, and everything else — including and especially code — is an inferior imitation. My correspondent hints at this idea with the suggestion that the reader should see the “actual equation”. Math is actual; code is not.

I reject math supremacy. Math notation did not come from the sky on stone tablets; it was designed by people for a purpose. Programming languages were also designed by people, for different purposes. Math notation has some good properties — it is concise and it is nearly universal. But programming languages also have good properties — most notably, they are executable. When we express an idea in code, we can run it, test it, and debug it.
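To make "run it, test it, and debug it" concrete, here is a minimal sketch of what that might look like for binomial_pmf. The parameter values (n = 10, p = 0.3) are illustrative assumptions, and the sketch swaps in math.comb from the standard library for scipy.special.comb so it has no third-party dependencies; neither choice comes from the book.

```python
from math import comb, isclose


def binomial_pmf(k, n, p):
    """Probability of k hits in n attempts, each succeeding with probability p."""
    # Same formula as the book's version; math.comb stands in for
    # scipy.special.comb (an assumption made to keep this dependency-free).
    return comb(n, k) * (p**k) * ((1 - p) ** (n - k))


# Illustrative parameters (assumptions, not from the book).
n, p = 10, 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Run it: probability of exactly 3 hits in 10 attempts.
print(pmf[3])

# Test it: the probabilities over all possible k should sum to 1 ...
assert isclose(sum(pmf), 1.0)

# ... and the mean of the distribution should be n * p.
mean = sum(k * q for k, q in enumerate(pmf))
assert isclose(mean, n * p)

# With p = 0.5 the distribution should be symmetric around n / 2.
assert isclose(binomial_pmf(2, n, 0.5), binomial_pmf(n - 2, n, 0.5))
```

If one of these assertions failed, the failure would point at a specific property the function violates, which is the kind of feedback a formula on paper cannot give.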

So here’s a thought: if you are writing for an audience that is comfortable with math notation, and your ideas can be expressed well in that form — go ahead and use math notation. But if you are writing for an audience that understands code, and your ideas can be expressed well in code — well then you should probably use code. “Actual” code.

Probably Overthinking It

Data science, Bayesian Statistics, and other ideas
