Sister blog of Physicists of the Caribbean in which I babble about non-astronomy stuff, because everyone needs a hobby

Monday 27 August 2018

Protracted problems caused by keeping code private

Originally shared by Eli Fennell

How A Software Bug Fueled a 7-Year Scientific Debate About Water

In 2011, a team of researchers at Berkeley, led by the renowned chemist David Chandler, announced the results of model simulations their team had performed on supercooled pristine water: at every temperature measured, they claimed, the liquid water remained fairly homogeneous in structure. In short, their models suggested that, if kept pure of dust and other contaminants, liquid water remained the same until reaching a rapid freezing point.

Given his widespread acclaim, Chandler's team's findings were widely accepted, much to the chagrin of a team of researchers at Princeton, led by Pablo Debenedetti. The Princeton team, seemingly using the exact same modeling, had come to a quite different conclusion: at supercooled temperatures, before freezing into ice crystals (which, in pristine water, do not form until the water is extremely supercooled, because there are no impurities for crystals to nucleate on, but which form quickly once they start), the water appeared to take on two different liquid states, high-density and low-density. As this resembled the critical point at higher temperatures where liquid water and water vapor become indistinguishable, they took this to indicate a second, low-temperature critical point, this one between the two liquid states.

Seeing that they were getting different results from apparently the same data and modeling, the two teams began to collaborate to find the source of the disagreement. This fell apart, though, when the Princeton team continued to publish results before a consensus on that source could be reached. The relationship between the two teams then turned contentious, and at times openly hostile, with each side accusing the other of an error but neither agreeing on what that error might be.

Eventually, Debenedetti had an insight: although the two teams had used the same mathematical models to simulate the behavior of supercooled water, Chandler's team had used an algorithm to speed up their processing, allowing them to run simulations over longer periods. While Chandler believed this gave their results an edge, subsequent speed-up efforts by Debenedetti had matched them for duration but not for results. It thus occurred to Debenedetti that there might be a bug in the Berkeley code.

Had Chandler and his team promptly turned over their code for inspection, the matter could have been resolved quickly, but instead it would be another couple of years before they published it. When they did, Debenedetti and his team were vindicated: the Berkeley algorithm used an unconventional and, as it turned out, improper technique to initialize its molecular dynamics simulations, which among other things inflated the simulated temperatures by tens of degrees.
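To make the "inflated temperature" point concrete, here is a minimal sketch of the kind of sanity check that published code lets anyone run: draw initial velocities for a target temperature, then measure the kinetic temperature those velocities actually correspond to. This is not the Berkeley code and does not reproduce its specific bug; the function names, the 5% tolerance, the 230 K target and the point-particle treatment of water are all assumptions chosen for illustration.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann constant, J/K


def draw_velocities(n_particles, mass_kg, target_temp_k, rng):
    """Draw 3D velocities from a Maxwell-Boltzmann distribution (hypothetical helper)."""
    sigma = math.sqrt(K_B * target_temp_k / mass_kg)  # std dev of each velocity component
    return [[rng.gauss(0.0, sigma) for _ in range(3)] for _ in range(n_particles)]


def kinetic_temperature(velocities, mass_kg):
    """Instantaneous temperature from equipartition: KE = (dof / 2) * k_B * T."""
    kinetic_energy = 0.5 * mass_kg * sum(
        vx * vx + vy * vy + vz * vz for vx, vy, vz in velocities
    )
    dof = 3 * len(velocities)  # assumption: point particles, no constraints removed
    return 2.0 * kinetic_energy / (dof * K_B)


def check_initialization(velocities, mass_kg, target_temp_k, rel_tol=0.05):
    """Return (ok, measured): ok is True if measured T is within rel_tol of the target."""
    measured = kinetic_temperature(velocities, mass_kg)
    return abs(measured - target_temp_k) <= rel_tol * target_temp_k, measured


if __name__ == "__main__":
    rng = random.Random(0)
    water_mass = 2.99e-26  # approximate mass of one H2O molecule, kg
    target = 230.0         # illustrative supercooled temperature, K

    good = draw_velocities(20000, water_mass, target, rng)
    print(check_initialization(good, water_mass, target))

    # Mimic a faulty initialization that leaves the system 10% too hot.
    hot = [[c * math.sqrt(1.1) for c in v] for v in good]
    print(check_initialization(hot, water_mass, target))
```

A check like this only catches the simplest class of error, but it shows why access to the code matters: without it, outsiders cannot tell whether the temperature a simulation reports is the temperature it actually sampled.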

As a result, Chandler's team had failed to see the transitional state of high- and low-density liquids, just as one would fail to see water turn to ice at the freezing temperature if one's thermometer were wrong and the real temperature were above freezing. In failing to publish their code right away, Chandler's team had, in fact, violated the scientific principles of transparency and reproducibility, since no one could truly have replicated their findings. In the process, they had wasted a lot of time and energy, for themselves and other researchers.

While opinion has now shifted in favor of Debenedetti's simulations, it is worth noting that despite 7 years of debate, neither side has yet proven anything. Their simulations were just that, simulations, using models of liquid molecular dynamics known to be imperfect (as all such models are). Only real tests, on real pristine water, at the right temperatures will resolve this.

While there can be no doubt that algorithms will play an increasingly important role in scientific research, they are no less in need of transparency than any other material or method used in conducting research. One cannot argue effectively against results from a Black Box (i.e. a mechanism whose inner workings are opaque), after all.

https://physicstoday.scitation.org/do/10.1063/PT.6.1.20180822a/full/
