AI Has Started Cleaning Up Facebook, but Can It Finish?

Artificial intelligence has proved effective at keeping nudity and pornography off of Facebook. But recognizing hate speech and bullying is a much tougher task.
Hotlittlepotato

In the early hours of Aug. 25, 2017, a ragged insurgent group from Myanmar’s Rohingya Muslim minority attacked military outposts in the country’s northwest, killing 12 people. Security forces quickly retaliated with a campaign of village burning and mass killings that lasted weeks. As Rohingya died by the thousands, Myanmar’s military leaders took to Facebook.

A post from the commander-in-chief pledged to solve “the Bengali problem,” using a pejorative for Rohingya in Myanmar. Another general wrote to praise the “brilliant effort to restore regional peace,” observing that “race cannot be swallowed by the ground but only by another race.” A UN fact-finding report on the violence later cited the commander-in-chief’s post as suggestive of genocide, and noted the history of Facebook posts whipping up hate against Rohingya in Myanmar. The mission’s chair told journalists that the site had played a “determining role” in the crisis.

In the US Capitol in April, Senator Jeff Flake asked Facebook CEO Mark Zuckerberg how his company might have avoided that role. The impassive then-33-year-old billionaire noted that he had hired more Burmese speakers. Then he expounded on a favorite topic—artificial intelligence. “Over the long term, building AI tools is going to be the scalable way to identify and root out most of this harmful content,” he said. During two days of congressional hearings, Zuckerberg mentioned AI more than 30 times. It would, he told lawmakers, fight fake news, prevent ads that discriminate on the grounds of race or gender, and hobble terrorist propaganda.

Facebook has faced a dizzying series of accusations and scandals over the past year. They include enabling Russian election interference and employment discrimination, as well as being an accessory to genocide in Myanmar. Monday, a Senate report said Russia's activities on Facebook properties were far greater than previously known, and suggested the company misled Congress by downplaying the idea that Russian trolls used its product to suppress turnout during the 2016 presidential election.

Many of Facebook’s apologies exhibit a common theme: Artificial intelligence will help solve the problems incubating on the company's platform. Mike Schroepfer, the company’s chief technology officer, says the technology is the only way to prevent bad actors from taking advantage of the service. With 2.3 billion regular users, having everything reviewed by humans would be prohibitively expensive—and creepy. “I think most people would feel uncomfortable with that,” Schroepfer says, eliding the possibility users may find it creepy to have algorithms review their every post. “To me AI is the best tool to implement the policy—I actually don't know what the alternative is.”

Facebook CTO Mike Schroepfer. Patricia de Melo Moreira/AFP/Getty Images

Counting on AI is a gamble. Algorithms have proved capable of helping to police Facebook, but they are far from a cure-all—and may never be. The company has had great success in detecting and blocking pornography and nudity. But training software to reliably decode text is much more difficult than categorizing images. To tamp down harassment, hate speech, and dangerous conspiracy theories across its vast platform, Facebook needs AI systems capable of understanding the shifting nuances of more than 100 different languages. Any shortfalls must be caught by Facebook’s roughly 15,000 human reviewers, but at the social network’s scale it’s unclear how manageable their workload will be. As events in Myanmar showed, gaps in the enforcement net that may look small from Menlo Park can feel dangerously large to people whose world is being shaped by Facebook.

Flesh Detector

Facebook’s push to automate its content moderation started on the initiative of an ad executive, not an expert in online discourse. Tanton Gibbs was hired as an engineering director in 2014 to work on ad technology, as he had previously at Microsoft and Google. After hearing about Facebook’s moderation challenges, he suggested a more algorithms-first approach. Facebook had adopted a tool called PhotoDNA developed by Microsoft and Dartmouth College to block known child exploitation images, but wasn’t deploying image-analysis software or AI more broadly. “They were strictly using humans to review reports for things like pornography, hate speech, or graphic violence,” says Gibbs. “I said we should automate that.” Facebook put Gibbs at the head of a new team, based in Seattle, known initially as CareML.

The new group quickly proved its worth. Gibbs and his engineers embraced a technology called deep learning, an approach to training algorithms with example data that had recently become much more powerful. Google showed the power of the technology when it developed software that learned to recognize cats. More quietly, Gibbs’ group taught deep learning algorithms to recognize pornography and nude human beings. Initially that software reviewed images flagged by Facebook users. After a year and a half, Gibbs got permission to let his systems flag newly submitted content before anyone reported it. Facebook says 96 percent of adult and nude images are now automatically detected, and taken down, before anyone reports them.
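Facebook hasn’t published the details of Gibbs’ system, but the general recipe described above—fine-tune a pretrained image classifier on examples human reviewers have already labeled, then use it to score new uploads before anyone reports them—can be sketched in a few lines of PyTorch. The model choice, folder layout, and flagging threshold below are illustrative assumptions, not Facebook’s implementation.

```python
# A minimal sketch (not Facebook's pipeline) of training an image
# classifier on reviewer-labeled examples, then using it to flag
# new uploads for human review.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import models, transforms
from torchvision.datasets import ImageFolder

# Reviewer-labeled images, assumed for this sketch to live in two
# folders: labeled_images/allowed/ and labeled_images/violating/.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
train_data = ImageFolder("labeled_images", transform=preprocess)
loader = DataLoader(train_data, batch_size=32, shuffle=True)

# Start from a network pretrained on ordinary photos and swap its
# final layer for a two-way "allowed vs. violating" head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:  # a single pass, just for illustration
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()

# At upload time, score new images and queue confident hits for review.
model.eval()
def flag_for_review(image_tensor: torch.Tensor, threshold: float = 0.9) -> bool:
    with torch.no_grad():
        probs = torch.softmax(model(image_tensor.unsqueeze(0)), dim=1)
    return probs[0, 1].item() >= threshold  # class 1 = "violating" here
```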

That’s still a lot of nude flesh slipping past Facebook’s algorithms. The company says it took down 30.8 million images and videos of nudity or sexual activity in the third quarter of 2018; that means the algorithms didn’t catch 1.3 million such images. In fact, Facebook estimates that the proportion of views with nudity or sexual content nearly doubled over the 12 months ending in September, to about 9 in every 10,000 views. “More nudity was posted on Facebook, and our systems did not catch all of it fast enough to prevent an increase in views,” Facebook said in its most recent community standards enforcement report. How much was posted and seen but not detected or reported is unknowable.

Still, the success of Gibbs’ project in fighting pornography has become a favorite talking point of Facebook executives touting the potential of AI to clean up their service. It’s working proof of the idea that an algorithmic immune system can help shelter Facebook users from harmful content—and the company from the consequences of hosting it. Facebook says that just over half of hate speech removed from the platform in the most recent three months was flagged first by algorithms, more than double the proportion earlier in the year. Some 15 percent of posts removed for bullying are identified and taken down before anyone has reported them. In neither case, though, do the algorithms remove the post; the programs flag the posts to be reviewed by people.

Facebook’s challenge is getting its technology to work well enough that its roughly 15,000 human reviewers can reliably pick up the slack in each of the more than 100 languages and countries in which the service is used. Getting its hate speech and bullying detectors close to the effectiveness and autonomy of its porn filters will be particularly difficult.

Deep learning algorithms are pretty good at sorting images into categories—cat or car, porn or not porn. They’ve also made computers better with language, enabling virtual assistants like Alexa and significant jumps in the accuracy of automatic translations. But they’re still a long way from understanding even relatively simple text in the way humans do.

Decoding Language

To understand whether a post reading “I’m going to beat you” is a threat or a friendly joke, a human reviewer might effortlessly take into account whether it was paired with an image of a neighborhood basketball court, or the phrasing and tone of earlier messages. “How a model could use context in that way is not understood,” says Ruihong Huang, a professor at Texas A&M University. She helped organize an academic workshop on using algorithms to fight online abuse this fall, at one of the world’s top conferences for language processing research. Attendance and the number of papers submitted roughly doubled compared with the event’s debut in 2017—and not because researchers smelled victory. “Many companies and people in academia are realizing this is an important task and problem, but the progress is not that satisfying so far,” says Huang. “The current models are not that intelligent, in short; that’s the problem.”

Srinivas Narayanan, who leads engineering in Facebook’s Applied Machine Learning group, agrees. He’s proud of the work his team has done on systems that can scan for porn and hate speech at huge scale, but human-level accuracy and nuance remains a distant hope. “I think we’re still far away from being able to understand that deeply,” he says. “I think machines can eventually, but we just don’t know how.”

Facebook has a large, multinational AI lab working on long-term, fundamental research that may one day help solve that mystery. It also has journalists, lawmakers, civil society groups, and even the UN expecting improvements right now. Facebook’s AI team needs to develop tricks that can deliver meaningful progress before the next scandal hits.

The products of that push for practical new AI tools include a system called Rosetta, announced this year, which reads text embedded in images and video so it can be fed into hate speech detectors. (There’s evidence some online trolls are already testing ways to trick it.) Another project used billions of hashtags from Instagram users to improve Facebook’s image recognition systems. The company has even used examples of bullying posts on Facebook to train a kind of AI-powered cyberbully, a text generator that pushes its moderation algorithms to get better. The company declined to provide WIRED a sample of its output.
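Rosetta itself isn’t public, but the shape of that first pipeline—pull the text out of an image, then run it through a text classifier—can be approximated with open-source parts. The sketch below uses the pytesseract OCR library and a toy keyword scorer as stand-ins for Facebook’s internal OCR and hate speech models.

```python
# A rough stand-in for a Rosetta-style pipeline: extract text embedded
# in an image, then score that text with a classifier. pytesseract and
# the toy classifier substitute for Facebook's internal systems.
from typing import Callable

from PIL import Image
import pytesseract

def score_image_text(
    image_path: str,
    text_classifier: Callable[[str], float],  # text -> probability of a violation
) -> float:
    extracted = pytesseract.image_to_string(Image.open(image_path))
    return text_classifier(extracted)

# Placeholder classifier, only to make the example runnable end to end.
banned_terms = {"example_slur_1", "example_slur_2"}  # hypothetical list
toy_classifier = lambda text: float(any(t in text.lower() for t in banned_terms))

print(score_image_text("meme.png", toy_classifier))
```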

One big challenge for these projects is that today’s machine learning algorithms must be trained with narrow, specific data. This summer, Facebook changed how some of its human moderators work, in part to generate more useful training data on hate speech. Instead of using their knowledge of Facebook’s rules to decide whether to delete a post flagged for hate speech, workers answered a series of narrower questions. Did the post use a slur? Does it make reference to a protected category? Was that category attacked in this post? A reviewer could then scan through all the answers to make the final call. The responses are also useful feedstock for training algorithms to spot slurs or other elements of hate speech on their own. “That granular labeling gets us really exciting raw training data to build out classifiers,” says Aashin Gautam, who leads a team that develops content moderation processes. Facebook is exploring making this new model permanent, initially for hate speech, and then perhaps for other categories of prohibited content.
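Facebook hasn’t published its questionnaire or data schema, but the idea of turning those granular answers into training examples might look roughly like the sketch below; the field names and the aggregation rule are assumptions made for illustration only.

```python
# A hypothetical sketch of converting granular reviewer answers into
# labeled training examples. Field names and the aggregation rule are
# illustrative, not Facebook's actual schema or policy logic.
from dataclasses import dataclass

@dataclass
class ReviewerAnswers:
    post_text: str
    contains_slur: bool
    references_protected_category: bool
    category_is_attacked: bool

def to_training_example(a: ReviewerAnswers) -> dict:
    # Simple rule for the sketch: treat a post as hate speech if it uses
    # a slur or attacks a protected category it references.
    is_hate = a.contains_slur or (
        a.references_protected_category and a.category_is_attacked
    )
    return {"text": a.post_text, "label": int(is_hate)}

# The per-question answers double as fine-grained labels: a slur
# classifier, for instance, could train directly on contains_slur.
reviews = [
    ReviewerAnswers("placeholder post text", False, True, False),  # dummy record
]
training_set = [to_training_example(r) for r in reviews]
print(training_set)
```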

Elsewhere, Facebook is trying to sidestep the training data problem. One lesson from the tragic events in Myanmar is that the company needs to get better at putting humans and software in place to understand the language and culture of different markets, says Justin Osofsky, a vice president who runs global operations.

The conventional approach to training algorithms to decode text in multiple languages would be extremely expensive for Facebook. To detect birthday greetings or hate speech in English, you need thousands, preferably millions of examples. Each time you want to expand to a new language, you need a fresh set of data—a major challenge for a company of Facebook’s scale.

As a solution, Facebook is adapting systems built for common languages such as English or Spanish to work for less common languages, like Romanian or Malay. One approach involves using automated translation. Facebook has been able to suppress clickbait in languages including Hungarian and Greek in part by converting posts into English so they can be fed into clickbait detectors trained on US content. It also conjures up new training sets for less common languages by translating English ones. Another project involves creating multilingual systems primed on deep similarities between languages, meaning that once trained on a task in English, they can instantly do the same thing in Italian, too. “These multilingual approaches have really helped accelerate our ability to apply AI to integrity problems across languages,” says Narayanan.
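The translation route, at least, is straightforward to sketch: run a non-English post through machine translation, then hand the result to a classifier trained only on English data. The translate function and the clickbait scorer below are placeholders, not Facebook’s systems.

```python
# A sketch of translation-based cross-lingual transfer: score posts in
# any language with a classifier trained only on English examples.
from typing import Callable

def make_crosslingual_scorer(
    translate: Callable[[str, str], str],        # (text, source language) -> English text
    english_classifier: Callable[[str], float],  # English text -> clickbait probability
) -> Callable[[str, str], float]:
    def score(post_text: str, lang: str) -> float:
        english = post_text if lang == "en" else translate(post_text, lang)
        return english_classifier(english)
    return score

# Placeholder components, just to show the flow end to end.
score = make_crosslingual_scorer(
    translate=lambda text, lang: text,                     # stand-in "translation"
    english_classifier=lambda text: float("!!!" in text),  # stand-in heuristic
)
print(score("You won't BELIEVE what happened next!!!", "hu"))
```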

The project also helps illustrate the scale of Facebook’s challenge. So far, its multilingual workarounds don’t work on languages for which the company has relatively small datasets, such as Burmese. The same challenge exists for Hausa, a West African language used in campaigns of anti-Muslim hate speech that local police told the BBC last month have led to more than a dozen murders. Facebook says it is expanding its relationship with Nigerian fact checking organizations and NGOs—as well as its use of machine learning to flag hate speech and violent images.

Invited to look ahead, Schroepfer, Facebook’s chief technology officer, concedes that preventing incidents like that from ever happening is impossible. “One question I often ask myself is what other endeavors of equivalent complexity have a 100 percent safety record,” he says. “I can’t think of one. Aircraft, cars, space travel, law enforcement. Do you know any city that’s got a zero crime rate or is on the path to that?”

All the same, he remains optimistic enough about Facebook’s path to imagine a day when its algorithms are so effective that bullying and hate speech virtually vanish. “My hope is that in two or three or five years there is so little of it on the site that it’s sort of ridiculous to argue that it’s having a big effect on the world,” Schroepfer says. A techie can dream.

