As fake news threatens to eat the world, a number of government, academic, and private organizations have launched a range of projects to combat its spread, including artificial intelligence systems that flag fake news by analyzing its content. Scientists at the Massachusetts Institute of Technology (MIT) and a research center in Qatar have added to that effort with a machine learning algorithm they say can help marginalize fake news by taking an approach familiar to anyone who has dealt with implausible gossip: Consider the source.
Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Qatar Computing Research Institute (QCRI) have demonstrated a new machine learning system that assesses the accuracy and bias of a source, such as a website, as a way to measure the quality of its content. The system also works fairly quickly, researchers say, which would be an improvement over current anti-fake news methods, which rely on armies of human fact checkers running individual claims to ground, or on AI systems that attempt to do the same. Whichever method is employed, false and misleading stories can circle the digital globe several times before the fact checkers catch up.
The MIT-Qatar system starts with the credibility of the source. “If a website has published fake news before, there’s a good chance they’ll do it again,” Ramy Baly, a postdoctoral researcher at MIT and lead author of a paper about the system, told MIT News. “By automatically scraping data about these sites, the hope is that our system can help figure out which ones are likely to do it in the first place.”
Researchers said the system can reliably assess a website’s trustworthiness based on about 150 of its articles, judging sites against ratings from Media Bias/Fact Check (MBFC), whose human fact checkers have rated more than 2,000 large and small news sites.
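As a rough illustration of that setup, and not the authors' actual code, the sketch below shows how per-source training examples might be assembled: gather roughly 150 articles from a site and attach MBFC-style factuality and bias labels. The SourceExample structure, the label values, and the scrape_recent_articles helper are all hypothetical placeholders.

```python
# Hypothetical sketch of assembling per-source training examples.
# SourceExample, the label strings, and scrape_recent_articles are illustrative
# placeholders, not part of the published system.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class SourceExample:
    domain: str                  # e.g., "example-news.com"
    factuality: str              # MBFC-style rating: "high", "medium", or "low"
    bias: str                    # e.g., "left", "moderate", "right"
    articles: List[str] = field(default_factory=list)  # raw article texts

def build_example(domain: str, factuality: str, bias: str,
                  scrape_recent_articles: Callable[[str, int], List[str]]) -> SourceExample:
    """Collect roughly 150 recent articles from a site and attach its MBFC-style labels."""
    texts = scrape_recent_articles(domain, 150)
    return SourceExample(domain, factuality, bias, texts)
```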
The most effective clues to bias were found in the consistent linguistic features of a site, including sentiment, complexity, and sentence structure, the researchers said. Purveyors of fake news, for instance, tended to employ hyperbolic, subjective, and emotional language more than other websites. The researchers also found consistent linguistic features separating the two ends of the political spectrum: left-leaning sites focused more on issues of harm and care, fairness, and reciprocity, as opposed to markers such as loyalty, sanctity, and authority.
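To make the idea concrete, here is a minimal, hedged sketch of a source-level classifier along those lines: compute a few crude stylistic proxies per article, standing in for the richer sentiment, subjectivity, and complexity features the researchers describe, average them per source, and fit an off-the-shelf classifier against MBFC-style labels. The specific features and the train_factuality_model function are illustrative assumptions, not the paper's pipeline.

```python
# Minimal sketch of a source-level factuality classifier; the stylistic features
# below are crude stand-ins for the richer linguistic features the paper describes.
import re
from typing import List

import numpy as np
from sklearn.linear_model import LogisticRegression

def article_features(text: str) -> np.ndarray:
    """Crude stylistic proxies: sentence length, word length, and 'shouting' markers."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return np.zeros(4)
    avg_sentence_len = len(words) / len(sentences)          # rough syntactic complexity
    avg_word_len = sum(len(w) for w in words) / len(words)  # rough lexical complexity
    exclaim_rate = text.count("!") / len(sentences)         # proxy for hyperbolic tone
    caps_rate = sum(w.isupper() and len(w) > 2 for w in words) / len(words)  # emphasis
    return np.array([avg_sentence_len, avg_word_len, exclaim_rate, caps_rate])

def source_features(articles: List[str]) -> np.ndarray:
    """Average article-level features so each source becomes a single feature vector."""
    return np.mean([article_features(a) for a in articles], axis=0)

def train_factuality_model(sources):
    """Fit a simple classifier mapping source features to MBFC-style factuality labels."""
    X = np.vstack([source_features(s.articles) for s in sources])  # one row per source
    y = [s.factuality for s in sources]                            # "high" / "medium" / "low"
    return LogisticRegression(max_iter=1000).fit(X, y)
```

A held-out set of sources could then be scored against the high-medium-low labels to estimate the kind of accuracy figures reported below.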
The research team points out that the system is a work in progress, with more improvements coming, but in rating news sources on MBFC’s high-medium-low factuality scale, it has been about 65 percent accurate, and it has been roughly 70 percent accurate in determining whether a site is left-leaning, right-leaning, or moderate.
The research team has built an open-source dataset of more than 1,000 sources annotated with accuracy and bias ratings, which it says is the largest database of its kind in the world. The researchers also are working on adapting the system to languages other than English, and on drilling down beyond broad left/right bias to more specific categories. Even with the planned improvements, however, they said the system will still work best in tandem with other fact-checking systems.
Quite a few anti-fake news initiatives have gotten underway in the past several years, including those run by journalism organizations, media leaders, technology companies, and others. The Fake News Challenge, for one, enlists volunteers to develop tools that help fact checkers speed up their work. Stanford University is trying to develop an AI system that can identify fake news based on its content. And the Army Research Laboratory is leading research into the human element of the fake news problem, examining how a pre-existing bias, or the lack of one, affects how people interpret the news they see.
Fake news isn’t exactly new; the Founding Fathers had plenty of complaints about what they saw as the press peddling fiction as fact. But with the power of the internet, the emergence of AI techniques, and an apparent lowering of standards by political leaders and participants, its persistent presence threatens to overwhelm fact-based debate. If fake news is to be abated, it will be essential to identify it before it gains much traction.