Sister blog of Physicists of the Caribbean in which I babble about non-astronomy stuff, because everyone needs a hobby

Thursday 8 October 2020

Super Spreaders Are Super Statistical

The usefulness of statistical parameters is hugely context-dependent. Even very simple quantities like the mean and standard deviation don't always give meaningful information. The latter is particularly vulnerable : if the distribution is indeed close to Gaussian then it's invaluable, but if it isn't, then it can be woefully misleading. But even the simple mean or median isn't immune - you can have data sets which simply don't have "typical" values that could be described by a single, simple parameter. Of course, this doesn't mean statistics is useless (far from it !), only that actually understanding what you're measuring is not so straightforward.

In the context of the pandemic, this article from The Atlantic does the best job I've yet seen of explaining why the R number is misleading. I've seen references to the "k" value before but they weren't so well-explained and I didn't really get it until now. That said, without any fear of hypocrisy I have to say that this article is far, far too long. The important bits can be reduced to a few paragraphs.

The definition of k is a mouthful, but it’s simply a way of asking whether a virus spreads in a steady manner or in big bursts, whereby one person infects many, all at once. After nine months of collecting epidemiological data, we know that this is an overdispersed pathogen, meaning that it tends to spread in clusters, but this knowledge has not yet fully entered our way of thinking about the pandemic — or our preventive practices.

A recent paper found that in Hong Kong, which had extensive testing and contact tracing, about 19 percent of cases were responsible for 80 percent of transmission, while 69 percent of cases did not infect another person. This finding is not rare: Multiple studies from the beginning have suggested that as few as 10 to 20 percent of infected people may be responsible for as much as 80 to 90 percent of transmission, and that many people barely transmit it. This highly skewed, imbalanced distribution means that an early run of bad luck with a few super-spreading events, or clusters, can produce dramatically different outcomes even for otherwise similar countries. 
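To get a feel for what overdispersion does, here's a minimal sketch of my own (it is not taken from the article or from the Hong Kong paper). It assumes the standard textbook model in which the number of people each case infects follows a negative binomial distribution with mean R and dispersion k, drawn here as a gamma-Poisson mixture. The values R = 2.5, k = 0.1 and k = 10 are purely illustrative assumptions, and all the function names are hypothetical :

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's multiplication method (fine for modest lam)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_secondary_cases(n_cases, r_mean, k, seed=0):
    """Secondary infections per case: negative binomial with mean r_mean, dispersion k.

    Each case gets its own infectiousness from a gamma distribution
    (shape k, scale r_mean / k), then infects a Poisson number of people.
    """
    rng = random.Random(seed)
    scale = r_mean / k
    return [poisson(rng, rng.gammavariate(k, scale)) for _ in range(n_cases)]


def zero_fraction(counts):
    """Fraction of cases that infect nobody."""
    return sum(1 for c in counts if c == 0) / len(counts)


def share_causing(counts, fraction=0.8):
    """Smallest share of cases that accounts for `fraction` of all transmission."""
    total = sum(counts)
    if total == 0:
        return 0.0
    running = 0
    for rank, c in enumerate(sorted(counts, reverse=True), start=1):
        running += c
        if running >= fraction * total:
            return rank / len(counts)
    return 1.0


# Illustrative values only: same mean R, very different dispersion k.
for k in (0.1, 10.0):
    counts = simulate_secondary_cases(50_000, r_mean=2.5, k=k, seed=42)
    print(f"k={k}: {zero_fraction(counts):.0%} infect nobody, "
          f"{share_causing(counts):.0%} of cases cause 80% of transmission")
```

Both runs have the same average R, yet in this model the fraction of cases infecting nobody is (1 + R/k)^(-k) : roughly 72% for k = 0.1 but only about 11% for k = 10. A small k is what "spreads in clusters" looks like numerically, and it's exactly the information the mean alone throws away.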

It makes some degree of intuitive sense that if the spread is dominated by the super spreaders, then it will be unpredictable and chaotic and not follow any neat expectations. But who are these super spreaders ? Are they a particular demographic, some group who are naturally more infectious, or is it more a function of behaviour and/or environment ? According to the article it seems to be mainly environmental, but not all factors are yet accounted for.

In study after study, we see that super-spreading clusters of COVID-19 almost overwhelmingly occur in poorly ventilated, indoor environments where many people congregate over time—weddings, churches, choirs, gyms, funerals, restaurants, and such—especially when there is loud talking or singing without masks... Cevik identifies “prolonged contact, poor ventilation, [a] highly infectious person, [and] crowding” as the key elements for a super-spreader event. Super-spreading can also occur indoors beyond the six-feet guideline, because SARS-CoV-2, the pathogen causing COVID-19, can travel through the air and accumulate, especially if ventilation is poor. 

So does the R number even really mean anything, or is it so strongly dependent on environment and other factors that we can't actually say if someone typically spreads it to N other people ? From the above it sounds like the latter is closer to reality. While this largely invalidates all those nice graphical explanations about exponential growth, it clearly doesn't mean the virus is less dangerous than thought - the figures speak for themselves on that front. What it might mean is that it's potentially more controllable. And it has interesting consequences for contact tracing : we should concentrate on going backwards, finding the source of the infections rather than who might become infected next.

The reason for backward tracing’s importance is similar to the friendship paradox: your friends are, on average, going to have more friends than you. Friendships are not distributed equally; some people have a lot of friends, and your friend circle is more likely to include those social butterflies, because how could it not? They friended you and others. And those social butterflies will drive up the average number of friends that your friends have compared with you, a regular person. (Of course, this will not hold for the social butterflies themselves, but overdispersion means that there are much fewer of them.)
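The friendship paradox is easy to check on a toy example. The sketch below uses a made-up network of five people (names and friendships are entirely hypothetical) with one social butterfly, and compares the average number of friends with the average number of friends that people's friends have :

```python
from statistics import mean

# Hypothetical toy network: Ana is the social butterfly, everyone else has few friends.
friendships = [
    ("Ana", "Ben"), ("Ana", "Cat"), ("Ana", "Dev"), ("Ana", "Eli"),
    ("Ben", "Cat"),
]


def build_adjacency(edges):
    """Turn a list of mutual friendships into {person: set of friends}."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def mean_friend_count(adjacency):
    """Average number of friends per person."""
    return mean(len(friends) for friends in adjacency.values())


def mean_friends_of_friends(adjacency):
    """For each person, average their friends' friend counts; then average over people."""
    return mean(
        mean(len(adjacency[f]) for f in friends)
        for friends in adjacency.values()
    )


network = build_adjacency(friendships)
print("average number of friends:", mean_friend_count(network))
print("average number of friends that friends have:", mean_friends_of_friends(network))
```

Here people have 2 friends on average, but their friends average 3.1, because Ana shows up in everyone's friend list. Ana herself is the exception : her friends average only 1.5 friends each against her own 4.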

Or, in short, you're more likely to know someone popular than to be popular yourself. This is important :

...if we can use retrospective contact tracing to find the person who infected our patient, and then trace the forward contacts of the infecting person, we are generally going to find a lot more cases compared with forward-tracing contacts of the infected patient, which will merely identify potential exposures, many of which will not happen anyway, because most transmission chains die out on their own... “backward tracing increases this maximum number of traceable individuals by a factor of 2-3, as index cases are more likely to come from clusters than a case is to generate a cluster.”
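The same logic can be sketched for infections. If you pick an infected person at random and ask who infected them, each possible infector is chosen in proportion to how many people they infected, so the answer is biased towards super spreaders. The counts below are invented for illustration and the function names are hypothetical. This toy calculation is not the model behind the "factor of 2-3" quoted above, which measures something different (traceable individuals) :

```python
def mean_secondary_cases(counts):
    """Average number of people infected by a randomly chosen case (forward view)."""
    return sum(counts) / len(counts)


def mean_secondary_cases_of_infector(counts):
    """Average number of people infected by the infector of a random secondary case.

    Each infector is picked with probability proportional to how many people
    they infected, which gives sum(c^2) / sum(c) (backward view).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no transmission recorded, so there is nobody to trace back from")
    return sum(c * c for c in counts) / total


# Invented example: ten cases, seven infect nobody, two infect one person, one infects eight.
counts = [0, 0, 0, 0, 0, 0, 0, 1, 1, 8]
print(mean_secondary_cases(counts))              # 1.0
print(mean_secondary_cases_of_infector(counts))  # 6.6
```

In this made-up example a typical case infects one person, but the person you reach by tracing backwards infected 6.6 people on average, so following up their contacts finds far more cases.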

Presumably this would also help identify whether super spreaders are some particular demographic, people with naturally high infectivity, or simply a product of their environment. If we knew that, we could pre-emptively tackle the settings most prone to spreading before outbreaks even begin.

This Overlooked Variable Is the Key to the Pandemic

Updated at 1:17 p.m. ET on October 1, 2020 There's something strange about this coronavirus pandemic. Even after months of extensive research by the global scientific community, many questions remain open. Why, for instance, was there such an enormous death toll in northern Italy, but not the rest of the country?
