Sister blog of Physicists of the Caribbean in which I babble about non-astronomy stuff, because everyone needs a hobby

Saturday, 27 October 2018

War is not yet over

Rumours of the end of war are greatly exaggerated...

I think my next long read has to be something by Pinker. It feels very unfair to read critiques without reading what they're criticising. And I'm not familiar with all of the methods here either. Bearing that in mind though, I didn't spot anything obviously wrong.

The overall gist seems to be that Pinker is not properly accounting for the stochastic, extreme (but bounded, as casualties cannot take any arbitrary value) nature of war and the incompleteness of the data. Rather than examining conflict as a continuous process, in which wars are tail-end extremities but not representative of the usual occurrences (or equivalently, especially destructive conflicts are effectively statistical flukes when considering war in general), the correct approach is to study the frequency of those tail-end events over time. And there's not much evidence that these are getting rarer. Historically, the interval between major conflicts has been so long that there just hasn't been enough time since the last one to draw any conclusion at all.

My only significant caveat would be that they don't clearly describe if Pinker is claiming violence overall is declining, or only warfare. So the nature of what they're trying to refute is a bit confused. I get the impression that Pinker is describing the overall levels, but here they specifically and exclusively concentrate on warfare. Fair enough if the original claim is that war is becoming rarer, but not necessarily if it's about violence. Living with the risk of war and the risk of being murdered are quite different prospects, I think.

From the paper (it's 26 pages so yer not allowed to claim TLDR for the summary) :

First, we are dealing with a “fat-tailed” phenomenon. What characterizes fat tailed variables? These have their properties (such as the average) dominated by extreme events, those "in the tails". Further, historical data are temporal (spread out over time) and statistical analyses of time series (such as financial data) require far more sophistication than simple statistical tests found in empirical scientific papers (there is a difference between ensemble probability and time probability, though not always, and the effect of the bias needs to be established).

The analysis needs to incorporate the unreliability of historical data – there is no way to go back and fact-check the casualties in the Peloponnesian war and we rarely have more than one side to the story. Estimates of war casualties are often anecdotal, spreading via citations, and based on vague computations, almost impossible to verify using period sources... the number of gaps between wars can be treated as a random variable, and its effect must be taken into consideration in the interpretation of the results.

Pinker (2011) treats as a single event the “Mongolian invasions” which lasted more than a century and a quarter. This swelled the numbers per event over the Middle Ages and contributed to the illusion that violence has dropped since, given that subsequent “events” had shorter durations. Likewise, the data makes it hard to assess whether the numbers include people who died of side effects of wars – say for example it makes a difference whether the victims of famine from the siege of Jerusalem are included or not in the historical figures.

At the core, Pinker’s severe mistake is one of standard naive empiricism – basically mistaking data (actually absence of data) for evidence and building his theory of why violence has dropped without even ascertaining whether violence did indeed drop. This is not to say that Pinker’s socio-psychological theories can’t be right: they are just not sufficiently connected to data to start remotely looking like science. Fundamentally, statistics is about ensuring people do not build scientific theories from hot air, that is without significant departure from random. Otherwise, it is patently "fooled by randomness". And we have a very clear idea what departure from random means.

For fat tailed variables, the conventional mechanism of the law of large numbers (on which statistical inference reposes) is considerably slower and significance requires more data and longer periods. Simply, the sample average is not a good estimator of the “true” mean; it has what is called a small sample bias when data is one-tailed (i.e. can only take either positive or negative values, as is the case with violence). In other words, not only do we need a lot of data to know what’s going on, but, as in the case of violence, we should expect that the mean violence as measured in sample to be lower than the true mean. Ironically, there are claims that can be done on little data: inference is asymmetric under fat-tailed domains. We require more data to assert that there are no black swans than to assert that there are black swans, hence we would need much more data to claim a drop in violence than to claim a rise in it.
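To convince myself of that small-sample bias, here is a minimal sketch rather than anything taken from the paper. It assumes a Pareto distribution with tail index 1.2 as a stand-in for a fat-tailed, positive-only variable (my choice, not the authors'), and the function name, sample size and number of trials are all made up for illustration. It simply counts how often the average of a modest sample falls below the known true mean.

```python
import random


def fraction_of_means_below_true(alpha=1.2, n=100, trials=2000, seed=0):
    """Fraction of sample means that fall below the true mean.

    Assumption: data are Pareto(alpha) with minimum value 1, whose true
    mean is alpha / (alpha - 1) whenever alpha > 1. This is an
    illustrative stand-in, not the distribution fitted in the paper.
    """
    rng = random.Random(seed)
    true_mean = alpha / (alpha - 1.0)
    below = 0
    for _ in range(trials):
        # Average of one small sample of n fat-tailed draws.
        sample_mean = sum(rng.paretovariate(alpha) for _ in range(n)) / n
        if sample_mean < true_mean:
            below += 1
    return below / trials


# With an unbiased, symmetric estimator this would sit near 0.5.
print(fraction_of_means_below_true())
```

If the sample average behaved nicely, about half the samples would land below the true mean. For a tail this heavy, most of them should, because the rare huge values that prop up the true mean are usually missing from any given sample.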

The second – more serious – error Pinker made in his conclusion is to believe that tail events and the mean are somehow different animals, not realizing that the mean includes these tail events. Further, for fat-tailed variables, the mean is almost entirely determined by extremes. If you are uncertain about the tails, then you are uncertain about the mean. It is thus incoherent to say that violence has dropped but maybe not the risk of tail events; it would be like saying that someone is "extremely virtuous except during the school shooting episode when he killed 30 students", or that nuclear weapons are very safe as they only kill a small percentage of the time.
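The claim that the mean is "almost entirely determined by extremes" is easy to poke at numerically. The following is a rough sketch under my own assumptions, not the paper's method: a Pareto distribution with tail index 1.2 plays the fat-tailed variable, an exponential distribution plays a thin-tailed comparison, and `top_share` is a hypothetical helper that reports how much of the total comes from the largest 1% of observations.

```python
import random


def top_share(values, top_fraction=0.01):
    """Share of the total contributed by the largest `top_fraction` of values."""
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * top_fraction))
    return sum(ordered[:k]) / sum(ordered)


rng = random.Random(1)
n = 10_000

# Assumed distributions, chosen only for illustration.
fat = [rng.paretovariate(1.2) for _ in range(n)]   # fat-tailed
thin = [rng.expovariate(1.0) for _ in range(n)]    # thin-tailed

print("top 1% share, fat-tailed :", top_share(fat))
print("top 1% share, thin-tailed:", top_share(thin))
```

In the thin-tailed case the top 1% contributes only a few percent of the total, so dropping them barely moves the mean. In the fat-tailed case they carry a large chunk of it, which is why being unsure about the tails means being unsure about the mean.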

We think it is important to stress that our data set, despite its evident temporal connotation, does not form a proper time series. It is in fact trivial to notice that the different conflicts of humanity do not share the same set of causes. Battles belonging to different centuries and continents are not only independent, but also surely have different origins. In statistical words, we cannot assume the existence of a unique conflict generator process, as if conflicts were coming from the same source.

For this reason, we believe that performing time series analysis on this kind of data is useless, if not dangerous, given that one could extrapolate misleading trends, as done for example in Pinker (2011). How could the An Lushan rebellion in China (755 AD) be dependent on the Siege of Constantinople by the Arabs (717 AD), or have an impact on the Viking Raids in Ireland (from 795 AD on)?

And then, after accounting for the bounded, tail-heavy nature of war and using more robust statistical methods (they say their conclusions are robust so long as the data is no more than 30% incomplete or wrong; they restrict themselves to conflicts with more than 3,000 casualties (seems low to me, but sometimes they use a much higher threshold), which should not suffer from too much incompleteness) :

Conclusion : Is there any trend? The short answer is no. Our data do not support the presence of any particular trend in the number of armed conflicts over time. Humanity seems to be as belligerent as always. No increase, nor decrease.

Naturally we are speaking about the type of conflicts for which we have performed our analysis, that is to say the largest and most destructive ones. We cannot say anything about small fights with a few casualties, since they do not belong to our data set - however it is crucial that, as a central property of the fat-tailedness of the process, a decline in homicide does not affect the total properties of violence and anyone’s risk of death. As we said, the mean is tail driven.

If we focus our attention on our data set, and in particular on the observations belonging to the last 500 years (from 1500 AD on), for which missing observations should be fewer and reporting errors smaller, our analyses suggest that the number of large conflicts over time follows a homogeneous Poisson process... In simple terms, this finding supports the idea that wars are randomly distributed accidents over time, not following any particular trend, as already pointed out by Richardson (1960).
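For anyone (like me) not used to the jargon, a homogeneous Poisson process just means events arriving at a constant average rate, with exponentially distributed gaps between them and no memory. A quick way to see what that looks like is to simulate one and check that the counts per window have a variance roughly equal to their mean. This is only a sketch with invented numbers: the rate, the horizon, the window length and both function names are my assumptions, and it is not the test the authors actually ran.

```python
import random
from statistics import mean, pvariance


def simulate_poisson_events(rate, horizon, rng):
    """Event times of a homogeneous Poisson process on [0, horizon).

    Gaps between consecutive events are drawn as independent
    exponentials with the given constant rate.
    """
    t, events = 0.0, []
    while True:
        t += rng.expovariate(rate)
        if t >= horizon:
            return events
        events.append(t)


def dispersion_index(events, horizon, window):
    """Variance-to-mean ratio of event counts per window (about 1 for Poisson)."""
    n_windows = int(horizon // window)
    counts = [0] * n_windows
    for t in events:
        i = int(t // window)
        if i < n_windows:
            counts[i] += 1
    return pvariance(counts) / mean(counts)


# Hypothetical numbers: one large conflict every 20 years on average,
# over a deliberately long horizon so the estimate is stable.
rng = random.Random(42)
events = simulate_poisson_events(rate=0.05, horizon=20_000, rng=rng)
index = dispersion_index(events, horizon=20_000, window=100)
print("dispersion index:", index)
```

An index well above 1 would suggest clustering (wars breeding wars), and a rate that drifts over time would show up as a trend in the counts. Neither, according to the authors, is what the data on large conflicts show.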

1 comment:

  1. I've read it a couple of years back. The hypothesis is that violence of all sorts is declining. Considering at least some of any observed decline has to be due to an increase in empathy due to awareness of the effects, it has always seemed like a bad idea to go around trumpeting that there's a decline.

