Twitter will ban ‘deceptive’ faked media that could cause ‘serious harm’

Twitter will ban faked pictures, video, and other media that are “deceptively shared” and pose a serious safety risk. The company just announced a new policy on synthetic and manipulated media — a category that encompasses sophisticated deepfake videos, but also low-tech deceptively edited content. In addition to banning egregious offenders, Twitter will label some tweets as “manipulated media” and link to a Twitter Moment that provides more context. This process will begin about a month from now, on March 5th.

The new rules apply to media that has been “significantly and deceptively altered or fabricated” by any means. That could include anything that’s been spliced, clipped, or overdubbed in a way that substantially changes its meaning, or any fabricated footage depicting a real person. Twitter isn’t banning this kind of altered media, but it may label it as fake and provide more context.

Twitter will crack down harder on manipulated media if it’s presented as truth or “likely to impact public safety or cause serious harm.” Content that meets one of these criteria will probably be labeled and may be removed; if it meets both criteria, it’s very likely to be removed. “Each of our rules is meant to prevent or mitigate a known, quantifiable harm,” said Twitter trust and safety VP Del Harvey on a call with reporters. “We think about the likelihood and severity of harm that could result and the best ways to mitigate that harm.”

Twitter proposed a manipulated media policy last year, and it based the new rules on comments it received after that announcement, as well as consultations with academic experts. Twitter head of site integrity Yoel Roth confirmed that the rules would apply to some high-profile misleading content — like a tightly cut clip of former Vice President Joe Biden talking about race. “Selective editing and cropping is something we consider to be media manipulation,” said Roth.

Twitter will look at the accompanying tweet text and account information, among other factors, to judge whether something is misleading. “Our goal in making these assessments is to understand whether someone on Twitter who is just scrolling through their timeline has enough information to understand whether the media being shared in a tweet is or isn’t what it claims to be,” said Roth. Labeled tweets could be marked with a flag and a warning before other users like or retweet them. Twitter could also choose not to recommend them, and it could link people to a landing page with more information. If a user feels their tweet has been unfairly labeled, they can appeal the decision.

Facebook and YouTube (among other platforms) already provide fact-checking recommendations for potentially misleading content. But where those platforms often prioritize specific trusted sources like fact-checking sites or Wikipedia, Twitter is apparently taking the same approach it does with Moments — which feature a hand-picked selection of tweets from across the platform.

“The format that we’re using in our product to curate these sources is Moments,” said Roth. “While we’re talking to a number of potential partners who we think have specific expertise in the area of media authenticity, we wouldn’t just be looking to feature tweets from only a select number of partners.”

The manipulated media policy intersects with some existing rules; Twitter says the most common deepfake-style content is nonconsensual pornography, for example, which is already banned. It will also create some gray areas, though. Modified subtitles or an added voiceover count as manipulation, but an inaccurate caption may not, even though miscaptioned photographs — like a tweet claiming an anti-ISIS rally is actually pro-ISIS — are one of Twitter’s biggest misinformation vectors.

Like a lot of content moderation rules, this framework seems aimed at addressing specific high-profile problems on the platform, not running a scorched-earth campaign against fake photos and video. Twitter is careful to say it “may” label misleading content, not that it will do so across the board. And it’s focusing on the overall site experience more than the intent of individual actors or the methods they use.