Sister blog of Physicists of the Caribbean in which I babble about non-astronomy stuff, because everyone needs a hobby

Wednesday 4 November 2020

These statisticians will shock you !

My university education included very little about the people behind the science. Besides a few obligatory anecdotes about Fritz Zwicky, and a picture of Thomson looking incredibly smug at having just discovered the electron, there wasn't much at all even about the history of science - let alone any biographical information. In some ways this is a good thing. Science strives for objectivity : the original purpose of a parameter should not matter, only understanding how to use it. What matters most is the collective, accumulated scientific knowledge, not a detailed understanding of long-debunked notions of yesteryear.

"I discovered the electron. Now it's time for coffee."

Of course this has a weakness. With such a narrow focus on current understanding, we miss out on a philosophical exploration of what happened : sure, we can learn about the current best practices, but at the expense of understanding how we got to where we are now. Sociologically, it's not a good idea to be totally ignorant of how discoveries happened - it brushes aside both ugly facts and beautiful truths (the story of Ruby Payne-Scott alone encapsulates both). This is not healthy. We should have some inkling of whether a discovery was made by a racist maniac or a disabled lesbian war veteran. It's inherently a good thing to know. And while our investigations into galaxies and hydrodynamics probably benefit from a coolly analytic approach, and couldn't be much influenced by racist ideologies even if we wanted them to, the same cannot be said for sociological statistics. Keeping things "objective" in this case is arguably dehumanising.

My suggestion is that each lecture course in the sciences should dedicate one, or at most two, additional lectures to the history of its subject. It wouldn't have to form part of the assessment. It would add a lot of interest to what are otherwise some very dry areas, as well as providing information which is all too easily forgotten. A couple of quotes on the monsters in the field will suffice.

Fisher continued to have disturbingly close ties to Nazi scientists even after the war. He issued public statements to help rehabilitate the image of Otmar Freiherr von Verschuer, a Nazi geneticist and advocate of racial hygiene ideas who had been a mentor to Josef Mengele, who conducted barbaric experiments on prisoners in Nazi camps. In von Verschuer’s defense, Fisher wrote, “I have no doubt also that the [Nazi] Party sincerely wished to benefit the German racial stock, especially by the elimination of manifest defectives, such as those deficient mentally, and I do not doubt that von Verschuer gave, as I should have done, his support to such a movement.”

According to Pearson, conflict between races was inevitable and desirable because it helped weed out the bad stock. As he put it, “History shows me one way, and one way only, in which a high state of civilization has been produced, namely the struggle of race with race, and the survival of the physically and mentally fitter race.”... Pearson considered the colonial genocide in America to be a great triumph because “in place of the red man, contributing practically nothing to the work and thought of the world, we have a great nation, mistress of many arts, and able ... to contribute much to the common stock of civilized man.”

Which would probably be a "holy shit !" moment from John Oliver. The average man in the street probably hasn't heard of Fisher or Pearson, but the average scientist probably has. I've used their findings - their impact was far-reaching indeed.

As a statistician, Fisher is personally responsible for many of the basic terms that now make up the standard lexicon, such as “parameter estimation,” “maximum likelihood,” and “sufficient statistic.” But the backbone of his contributions was significance testing. Fisher’s 1925 textbook Statistical Methods for Research Workers, containing statistical recipes for different problems, introduced significance testing to the world of science and became such the industry standard that anyone not following one of his recipes would have difficulty getting published... In the process, his methods made possible whole new research hypotheses: questions like whether two variables were correlated, or whether multiple populations all had the same mean.

Which doesn't sound like a fundamentally racist ideology, and it isn't. What Fisher got wrong was his interpretation of what this all meant. He seems to have had an innately binary, absolutist viewpoint : this is right or wrong, the data either supports a conclusion or it doesn't, with those conclusions being inescapable from the data alone. Again, to re-use a useful quote, "it's the difference between thinking that all the facts you have are all the facts there are".

Like Pearson, Fisher maintained that he was only ever following the numbers where they took him. Significance testing, for Fisher, was a way of communicating statistical findings that was as unassailable as a logical proof. As he wrote in 1932, “Conclusions can be drawn from data alone ... if the questions we ask seem to require knowledge prior to these, it is because ... we have been asking somewhat the wrong questions.”

Last year, a letter signed by over 800 scientists called for an end to the concept of statistical significance, and the leadership of the American Statistical Association issued a blunt decree: “Don’t say ‘statistically significant.’ ” The heart of the problem with significance testing is that making binary decisions about homogeneity was never a meaningful statistical task. With enough data, looked at closely enough, some inhomogeneities and statistically significant differences will always emerge. In the real world, data are always signaling something. It just may not be clear what.

As first argued by psychologist Edwin Boring in 1919, a scientific hypothesis is never just a statistical hypothesis—that two means in the population are different, that two variables are correlated, that a treatment has some nonzero effect—but an attempt at explaining why, by how much, and why it matters. The fact that significance testing ignores this is what economists Deirdre McCloskey and Stephen Ziliak in their 2008 book The Cult of Statistical Significance called the “sizeless stare of statistical significance.” As they put it, “Statistical significance is not a scientific test. It is a philosophical, qualitative test. It does not ask how much. It asks ‘whether,’ ”—as in, whether an effect or association simply exists. “Existence, the question of whether, is interesting,” they said, “but it is not scientific.”
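To make the "sizeless stare" concrete, here is a minimal sketch of my own (not from the article). It assumes a simple two-sample z-test with equal group sizes and unit variance, and a made-up, deliberately negligible difference between the groups of 2% of a standard deviation. The function name `two_sided_p` and the numbers are purely illustrative.

```python
import math


def two_sided_p(effect_size: float, n_per_group: int) -> float:
    """Two-sided p-value of a two-sample z-test.

    Assumes equal group sizes and unit variance in both groups, so the
    observed difference in means is already in units of standard deviations.
    """
    # Standard error of the difference in means is sqrt(2 / n).
    z = effect_size * math.sqrt(n_per_group / 2)
    # For a standard normal, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical effect: the groups differ by 2% of a standard deviation.
effect_size = 0.02

for n in (100, 10_000, 1_000_000):
    p = two_sided_p(effect_size, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n = {n:>9,} per group : p = {p:.3g} ({verdict}), "
          f"effect size still {effect_size}")
```

The effect size never changes, but the verdict flips from "not significant" to "significant" once the sample is large enough. The test answers "whether", given enough data, and says nothing at all about "how much".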

But it may not be obvious how this kind of thinking - flawed though it is - translates into racism. I would hope that in the modern world it's inherently obvious that you can't simply say that something is either significant or it isn't; the very fact that p can take any value ought to indicate that (dare I say it !) all by itself. Pretty much everyone uses different sorts of significance measures but hardly anyone in the sciences, in my experience, is driven to become a proponent of Eugenics Wars as some of the early statisticians were. Indeed, this binary-absolutist style of thought seems incredibly ironic : statistics, to me, inherently relies on understanding various biases. Doing statistics properly means considering alternative interpretations and trying to avoid fooling oneself. How, then, did the attempt at objectivity lead to such vile conclusions ? The answer, I think, is that it didn't.

For the purposes of eugenic discrimination, it was enough to state that distinct racial subgroups existed or there was a “significant” correlation between intelligence and cleanliness or a “significant” difference in criminality, fertility, or disease incidence between people of different socioeconomic classes. The first hypotheses were taxonomic: whether individuals could be considered to be of the same species, or whether people were of the same race. 
The separation was everything—not how much, what else might explain it, or why it mattered, just that it was there. Significance testing did not spring fully formed from the heads of these men. It was crafted and refined over the years specifically to articulate evolutionary and eugenicist arguments. Galton, Pearson, and Fisher the eugenicists needed a quantitative way to argue for the existence of such differences, and Galton, Pearson, and Fisher the statisticians answered the call with significance testing.

I would suggest that binary thinking can drive racism, but racism doesn't necessarily follow directly from it. Despite the protestations of the early statisticians, you have to have some underlying preference for it in the first place, which you can use data to incorrectly support. A career in statistics isn't going to lead to joining the KKK unless you already have a perverse and warped mindset. These early villains were expressing their existing beliefs through their undoubted analytic prowess, not having their opinions shaped by the data. They would fall firmly in the top-left of the supervillain chart. In short, crude and racist ideas drove statistical methodologies, not the other way around.

Most scientists now understand that the data do not speak for themselves and never have. Observations are always possible to interpret in multiple ways, and it’s up to the scientist and the larger community to decide which interpretation best fits the facts. What we should be asking is what causal mechanism explains the difference, whether it can be applied elsewhere, and how much benefit could be obtained from doing so.

I'm pleased to report that I made that exact same first statement, almost word for word, in my lectures on galaxy evolution :

The most obvious (and tempting) conclusion from this graph, if we take it at face value, is that chocolate makes you more intelligent. But again, we don't really know what the independent variable is here. It would be equally valid to suggest that maybe it's the other way around. Maybe when someone in a country wins a Nobel prize, the rest of the populace eat chocolate in celebration ! My point is that the data doesn't interpret itself. Interpretation is something that happens in your head - there and nowhere else - and not by the data itself.
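A small sketch of why the graph alone can't settle which way the arrow points. It assumes a plain Pearson correlation, and the numbers are invented purely for illustration (they are not the data from the graph). The helper `pearson_r` is my own, written out by hand.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented values for five imaginary countries, NOT real data.
chocolate_kg_per_person = [1.0, 2.5, 4.0, 6.5, 9.0]
nobels_per_10_million = [0.5, 3.0, 5.5, 11.0, 25.0]

r_chocolate_then_nobels = pearson_r(chocolate_kg_per_person, nobels_per_10_million)
r_nobels_then_chocolate = pearson_r(nobels_per_10_million, chocolate_kg_per_person)

print(f"chocolate vs Nobels : r = {r_chocolate_then_nobels:.3f}")
print(f"Nobels vs chocolate : r = {r_nobels_then_chocolate:.3f}")
```

Swapping the two variables gives the same coefficient. The statistic has no notion of which variable is the independent one, so any claim about direction has to come from somewhere other than the number.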

Returning to the Nautil.us article :

 “What the future of science needs is a democratization of the analysis process and generation of analysis,” and that what scientists need to do most is “hear what people that know about this stuff have been saying for a long time. Just because you haven’t measured something doesn’t mean that it’s not there. Often, you can see it with your eyes, and that’s good enough.”

Exactly ! With the proviso that of course this doesn't mean chucking statistics out completely - that's the attitude of a racist idiot - only that they don't tell the whole story by themselves. They inform the story but never dictate it. If you can see a trend by eye, you can quantify it statistically, but if you can't, then you have to be much more careful about what the statistics are saying. Your eyes can certainly fool you, but so can numbers. There is no philosophically iron-clad method for establishing truth.

In the 1972 book Social Sciences as Sorcery, Stanislav Andreski argued that, in their search for objectivity, researchers had settled for a cheap version of it, hiding behind statistical methods as “quantitative camouflage.” Instead, we should strive for the moral objectivity we need to simultaneously live in the world and study it. “The ideal of objectivity,” Andreski wrote, “requires much more than an adherence to the technical rules of verification, or recourse to recondite unemotive terminology: namely, a moral commitment to justice—the will to be fair to people and institutions, to avoid the temptations of wishful and venomous thinking, and the courage to resist threats and enticements.”

How Eugenics Shaped Statistics - Issue 92: Frontiers - Nautilus

In early 2018, officials at University College London were shocked to learn that meetings organized by "race scientists" and neo-Nazis, called the London Conference on Intelligence, had been held at the college the previous four years. The existence of the conference was surprising, but the choice of location was not.

1 comment:

  1. Wow. I use Fisher's equation a lot in my work. I shall start calling it Kolmogorov's equation.
    BTW, in the class notes for Classical Mechanics, which I prepared for my courses, I added a lot of brief biographies, and I usually mention them briefly in class. My favourite one is Rutherford, mainly because it lets me ask if they know Genesis; they don't, these centennials.

