Sister blog of Physicists of the Caribbean in which I babble about non-astronomy stuff, because everyone needs a hobby

Tuesday, 31 July 2018

Spotting fake news via algorithm

Can algorithms spot fake news? The answer is a cautious "probably", but the details of how the algorithms work are very interesting. It reminds me of algorithms used to search HI data cubes, of all things: automatic techniques are really only semi-automatic. You use them to spot a possible source, then get a human to check. From the original data set there are no foolproof criteria to judge if a source is real. The ultimate verification is to do a repeat observation and see if the source is still there. With news, that step would have to be an on-site inspection or contacting the affected people directly. As with HI source verification, short of looking again for ourselves we don't have an objective standard of truth. But we may generate criteria for a better initial guess, minimising the amount of follow-up time required.

When it comes to inspecting news content directly, there are two major ways to tell if a story fits the bill for fraudulence: what the author is saying and how the author is saying it. Ciampaglia and colleagues automated this tedious task with a program that checks how closely related a statement’s subject and object are. To do this, the program uses a vast network of nouns built from facts found in the infobox on the right side of every Wikipedia page — although similar networks have been built from other reservoirs of knowledge, like research databases.
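To make the "how closely related are the subject and object" idea a bit more concrete, here is a minimal sketch. It is not Ciampaglia's actual method: I'm assuming a plain breadth-first search for the shortest path as a stand-in for whatever relatedness measure they really use, and the tiny graph is a toy I made up for illustration rather than anything pulled from Wikipedia infoboxes. The function names `build_graph` and `path_length` are my own.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected graph (dict of sets) from (noun, noun) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def path_length(graph, start, goal):
    """Number of hops on the shortest path, or None if no path exists."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Toy network: illustrative edges only, not real infobox data.
toy = build_graph([
    ("Marie Curie", "Warsaw"),
    ("Warsaw", "Poland"),
    ("Marie Curie", "Physics"),
    ("Paris", "France"),
])

close = path_length(toy, "Marie Curie", "Poland")    # two hops via Warsaw
unlinked = path_length(toy, "Marie Curie", "France")  # no path in this toy graph
```

A short path would count as weak evidence that a statement linking the two nouns is plausible, and a long or missing path as a reason to send it to a human. That is the same semi-automatic division of labour as in source finding.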

Writing style may be another giveaway. Compared with real news, false articles tended to be shorter and more repetitive with more adverbs. Fake stories also had fewer quotes, technical words and nouns... The fake news in this analysis also tended to use more positive language and express more certainty.
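For what it's worth, the stylistic cues listed there are the sort of thing you can count with very little code. The sketch below is my own crude illustration, not the researchers' feature set: in particular, treating any word ending in "ly" as an adverb is a rough assumption (it will happily count "family"), and the feature names are invented.

```python
import re


def style_features(text):
    """Count a few crude stylistic signals in a piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    n = len(words)
    return {
        # Total number of words (fake stories tended to be shorter).
        "length": n,
        # Share of words that repeat an earlier word (0 means no repeats).
        "repetition": 1 - len(set(words)) / n if n else 0.0,
        # Rough adverb proxy: fraction of words ending in "ly" (an assumption).
        "ly_fraction": sum(w.endswith("ly") for w in words) / n if n else 0.0,
        # Number of quotation-mark characters, straight or curly.
        "quote_marks": sum(text.count(q) for q in '"“”'),
    }


features = style_features("Really really shocking! Totally true.")
```

None of these numbers means anything on its own. They would only become useful once compared against a labelled set of real and fake stories, which is exactly the thing the field apparently lacks.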

Other telltale signs of false news might be much harder to manipulate — namely, the kinds of audience engagement these stories attract on social media. Cao’s team built a system that could round up the tweets discussing a particular news event, then sort those posts into two groups: those that expressed support for the story and those that opposed it. The system considered several factors to gauge the credibility of those posts. If, for example, the story centered on a local event that a user was geographically close to, the user’s input was seen as more credible than the input of a user farther away. If a user had been dormant for a long time and started posting about a single story, that abnormal behavior counted against the user’s credibility. By weighing the ethos of the supporting and the skeptical tweets, the program decided whether a particular story was likely to be fake.
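The weighing described here can be sketched in a few lines. To be clear, this is only my guess at the general shape of such a scheme, not Cao's system: the 50 km threshold, the factor-of-two weights and the names `Post`, `credibility` and `story_score` are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Post:
    supports: bool       # True if the post backs the story, False if it disputes it
    distance_km: float   # how far the user is from the reported event
    was_dormant: bool    # account was inactive for a long time before posting


def credibility(post):
    """Weight for one post. All numbers here are made-up assumptions."""
    weight = 1.0
    if post.distance_km <= 50:   # nearby users count for more
        weight *= 2.0
    if post.was_dormant:         # suddenly reactivated accounts count for less
        weight *= 0.5
    return weight


def story_score(posts):
    """Score in [-1, 1]: positive means credible support outweighs credible doubt."""
    support = sum(credibility(p) for p in posts if p.supports)
    oppose = sum(credibility(p) for p in posts if not p.supports)
    total = support + oppose
    return (support - oppose) / total if total else 0.0


# Two distant, previously dormant supporters against one local sceptic.
posts = [
    Post(supports=True, distance_km=2000, was_dormant=True),
    Post(supports=True, distance_km=1500, was_dormant=True),
    Post(supports=False, distance_km=10, was_dormant=False),
]
score = story_score(posts)  # negative: the single local sceptic outweighs both supporters
```

Even though supporters outnumber sceptics two to one in this example, the score comes out negative, which is the whole point of weighting by credibility rather than just counting heads.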

Li and colleagues studied the shapes of repost networks that branched out from news stories on social media. Li’s team found most people tended to repost real news straight from a single source, whereas fake news tended to spread more through people reposting from other reposters. A typical network of real news reposts “looks much more like a star, but the fake news spreads more like a tree,” Li says. This held true even when Li’s team ignored news originally posted by well-known, official sources, like news outlets themselves. Reported March 9 at arXiv.org, these findings suggest that computers could use social media engagement as a litmus test for truthfulness, even without putting individual posts under the microscope.
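The star-versus-tree distinction is easy to put numbers on. As a minimal sketch (my own illustration, not Li's analysis), suppose each repost records who it was reposted from. Then two simple shape measures are the fraction of reposts taken directly from the original source and the depth of the longest repost chain. The function name `cascade_shape` and the two example cascades are hypothetical.

```python
def cascade_shape(parent_of, source):
    """Return (direct_fraction, max_depth) for a repost cascade.

    parent_of maps each reposter to the account they reposted from.
    It is assumed that every chain eventually leads back to `source`.
    """
    def depth(node):
        hops = 0
        while node != source:
            node = parent_of[node]
            hops += 1
        return hops

    depths = [depth(node) for node in parent_of]
    direct_fraction = sum(1 for d in depths if d == 1) / len(depths)
    return direct_fraction, max(depths)


# Star: everyone reposts straight from the source.
star = {"a": "src", "b": "src", "c": "src", "d": "src"}
# Tree: most people repost from other reposters.
tree = {"a": "src", "b": "a", "c": "b", "d": "a"}

star_shape = cascade_shape(star, "src")   # (1.0, 1)
tree_shape = cascade_shape(tree, "src")   # (0.25, 3)
```

A high direct fraction and shallow depth looks like the "star" Li describes for real news. A low direct fraction with long chains looks like the "tree". Where exactly to draw the line between the two is something the sketch leaves open.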

Even as algorithms get more astute at flagging bogus articles, there’s no guarantee that fake news creators won’t step up their game to elude detection. If computer programs are designed to be skeptical of stories that are overly positive or express lots of certainty, then con authors could refine their writing styles accordingly. “Fake news, like a virus, can evolve and update itself,” says Li.

This last point may be particularly interesting. Anything that evolves must propagate. And to propagate, fake news must attract viewers. How does it do that? Through rhetorical incitement of emotion. An often overlooked point about fake news is that yes, lies can spread more quickly than truths. But this isn't because there's anything special about lies per se; it's just that it's easier to construct a lie which excites the senses than to find a new truth which does the same. Universally appealing truths do also go viral; it just happens less often because they're rarer.

So I'm wondering: if you have a mechanism which is reasonably good at spotting fake news by its rhetorical appeal and propagation, and if you find some way to remove/reduce/suppress it, where would it go from there? Clearly it would then have to sell itself by some other method, but making it seem less emotionally appealing won't work because then it would have no advantage over actual news stories. What would be the alternative? Centralised sites won't work either, since spotting a liar is often easier than spotting a lie (under the assumption that we could appoint people who aren't so idiotic that they think that obvious liars are decent people). It seems to me there are relatively few avenues. I suppose one option might be subtle manipulation of real news stories, resharing them with minor modifications to give a different message than intended. This would have to be done very carefully in order to actually get people to believe them (e.g., it's obvious to anyone when you meant to say "would" as opposed to "wouldn't"). Another option might be to play the "merchants of doubt" card: rather than suggesting alternative facts, undermine the notion of objective truth by exaggerating the scope for disagreement. This then makes people much more vulnerable to whatever sources of alternative facts are still available.

To help sort fake news from truth, programmers are building automated systems that judge the veracity of online stories. A computer program might consider certain characteristics of an article or the reception an article gets on social media. Computers that recognize certain warning signs could alert human fact-checkers, who would do the final verification.

Automatic lie-finding tools are “still in their infancy,” says computer scientist Giovanni Luca Ciampaglia of Indiana University Bloomington. Researchers are exploring which factors most reliably peg fake news. Unfortunately, they have no agreed-upon set of true and false stories to use for testing their tactics. Some programmers rely on established media outlets or state press agencies to determine which stories are true or not, while others draw from lists of reported fake news on social media. So research in this area is something of a free-for-all.

But teams around the world are forging ahead because the internet is a fire hose of information, and asking human fact-checkers to keep up is like aiming that hose at a Brita filter. “It’s sort of mind-numbing,” says Alex Kasprak, a science writer at Snopes, the oldest and largest online fact-checking site, “just the volume of really shoddy stuff that’s out there.”

https://www.sciencenews.org/article/can-computer-programs-flag-fake-news

1 comment:

  1. The point that many lies do die is underappreciated. They make it up on volume. Truth is constrained.
