Eugenics and Statistics Part Two, Reflections and Implications | Nathaniel Joselson

In the previous post I gave an historical overview of the research and careers of Karl Pearson and Ronald Fisher. They were the leading statisticians of their age, and they explicitly endorsed settler colonialism and population eugenics on a global scale. Disturbing as their views are in retrospect, no less disturbing is the fact that they are still held in the highest esteem by modern statisticians, who perhaps suffer from historical amnesia or have only been taught a white-washed history. This draws statistics as a discipline into a moral grey-zone, which in turn raises the following questions: What have we normalized from that period of history that bears re-assessing? What thinking or biases have we inherited from them? How do we make sense of this history and move forward towards a decolonized statistics? These are the questions that I hope to touch on in this post.

First of all, these were some of the foremost statisticians of their time, and the marks they have left on classical statistics and university curricula cannot be overstated. Pearson was the chief editor of Biometrika, the only journal of mathematical statistics at the time, and he used this platform to publish numerous eugenics papers, even when they introduced no novel statistical technique. This cross-pollination between the fields meant that racial eugenics (measuring skulls, disease frequencies, or “intelligence” across different races) was mainstream statistical knowledge. Furthermore, many of the statistical techniques we take for granted were developed for and used in eugenics research, including t-tests, discriminant analysis, and even our modern form of simple linear regression.

However, even more concerning than the uncomfortable content is the language and statistical thinking we have inherited, namely an obsession with “significant difference.” This is seen as the backbone of classical statistics, and in The Grammar of Science Pearson devotes a whole chapter to unpacking “correct” statistical research. Such research is hypothesis-driven: the null hypothesis, to be proven wrong, is that there is no difference between the populations. If the populations can then be shown to differ by a t-test, an F-test, a z-test, or a chi-squared test, the research was successful and is worth publishing. It is this exact format that is taught to this day in undergraduate statistics.
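To make the recipe concrete, here is a minimal sketch, not from the original post: it assumes SciPy and NumPy are available, and the two groups of measurements are invented for illustration:

```python
# Illustrative sketch of the classical recipe (SciPy assumed available):
# state a null hypothesis of "no difference", compute a test statistic,
# and declare the result "significant" if p falls below 0.05.
import numpy as np
from scipy import stats

# Hypothetical measurements for two groups (invented for this example).
group_1 = np.array([9.1, 10.2, 9.8, 10.5, 9.7, 10.0, 9.4, 10.3])
group_2 = np.array([12.9, 13.4, 12.7, 13.8, 13.1, 12.5, 13.3, 13.0])

t_stat, p_value = stats.ttest_ind(group_1, group_2)  # H0: equal means
significant = p_value < 0.05  # the conventional cutoff
print(f"t = {t_stat:.2f}, p = {p_value:.2g}, significant: {significant}")
```

Every step here (the null hypothesis of no difference, the test, the 0.05 cutoff) mirrors the publication format described above.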

Never mind that these tests are influenced by sample size, so that with enough data even slight differences can be proven “significant”; never mind that they were developed to “prove” such hypotheses as “there is significant difference in personal cleanliness between white and non-white children”; and never mind that disproving one hypothesis doesn’t necessarily prove the other to be true. These tests are the basis of classical statistics, and thus should form the basis of every statistician’s knowledge, right?
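The first caveat above, that a large enough sample makes even a trivial difference register as “significant,” can be sketched numerically. This is an illustrative example, not from the original post; it assumes NumPy and SciPy are available, and the means, spread, and sample sizes are invented:

```python
# Illustrative sketch (assumed setup: NumPy + SciPy installed).
# The true difference between the groups is fixed and tiny (0.2 on a
# scale with standard deviation 15); only the sample size n changes.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mean_a, mean_b = 100.0, 100.2  # hypothetical means: a negligible gap
sd = 15.0                      # same spread in both groups

pvalues = {}
for n in (100, 10_000, 1_000_000):
    a = rng.normal(mean_a, sd, size=n)
    b = rng.normal(mean_b, sd, size=n)
    t, p = stats.ttest_ind(a, b)  # two-sample t-test, H0: equal means
    pvalues[n] = p
    print(f"n={n:>9,}  t={t:6.2f}  p={p:.3g}")
```

The effect never changes; only n does. At n = 100 the gap is invisible to the test, while at a million observations the same negligible gap is all but guaranteed to yield a vanishingly small p-value, which is precisely why “significant” cannot be read as “meaningful.”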

To me, this appeal to the biases of the past to justify inaction in the present doesn’t make sense. We must be critical and forward thinking, so we aren’t afraid to say that obsession with difference is divisive and categorization is colonial. Just as we need to unlearn colonial categorization of objects and people as economically productive or unproductive, we need to unlearn categorization of statistical differences as significant or not significant. Statistical significance, like every other observable phenomenon (gender, sexuality, race), lies on a spectrum, not a categorical scale. A post-colonial understanding of statistical significance doesn’t rely on p-values less than 0.05, but rather on a holistic understanding of the data in question and the mathematical properties of the population that generated it. This is knowledge discovery, not hypothesis testing. This is interdisciplinary research, not isolated extrapolation by inherently biased human beings.

Until we realize that these “proofs” of significant difference justified many years of violence in the name of white supremacy, we will never teach decolonized statistics. In fact, we will continue to produce statisticians who use these tools to justify modern-day violence perpetrated in the name of statistical significance. In a later post I will go into detail about some inherently racist mathematical models for crime prevention, insurance pricing, and financial aid allocation that justify their bias because race is a statistically significant predictor. But for now, does that mean there is no place to teach Fisher and Pearson’s classical statistics to students? Not necessarily. In a decolonized statistics we must know and critique previous knowledge, while being careful about the implications of what we teach and are taught, especially the implications of emphasizing significant difference.

Written on November 2, 2016