The Media Today

Just add watermarks

October 31, 2024
(Photo by Jakub Porzycki/NurPhoto via AP)


Last week, DeepMind, Google’s AI research lab, made a tool for identifying AI-generated text generally available—a widely appreciated step toward greater accountability in an industry struggling with misattribution. The tool, called SynthID, builds on a growing research area: watermarking, or the idea of adding hidden patterns to digital content that reliably identify it as having been made using AI. Google has been rolling out SynthID for the past year; in May, the company announced that it was applying the tool to its Gemini app, an AI chatbot, and Veo, its AI video generation model. Now developers will be able to use the technology in their own models for free. 

Regulators and researchers have been eyeing watermarking as a potential check on AI harms like copyright infringement and the generation of disinformation and fake news sites. But while it is a step in the right direction, SynthID could face the same challenge as many other AI safety initiatives: determined individuals who want to circumvent the system often find a way.

SynthID is the latest milestone in a long history of watermarking. The concept dates back to thirteenth-century Italy, when paper manufacturing became popular in Europe; papermakers discovered that, by varying the thickness of the paper pulp, they could create unique marks that were visible when the paper was held up to light. (The watermarks were used to identify the trade guild that produced the paper and to prevent forgeries.) As information transitioned from paper to digital in the late twentieth century, so did watermarks. Digital watermarking doesn’t actually involve water (surprise); instead, it typically embeds a digital code or image, visible or hidden, within multimedia content. Digital watermarks have mostly been used to prevent piracy, but they’ve also been used to hide sensitive information: for instance, the idea behind one of the first digital watermarking techniques was to hide the identity of patients in medical images. This early kind of digital watermarking, known as the “spread-spectrum” watermark, spreads the watermark across a wide range of frequencies in a way that’s imperceptible. 
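For readers curious what “spreading a watermark” looks like in practice, here is a minimal, hypothetical sketch in Python. It is not drawn from any of the original systems, and it is simplified to operate on raw samples rather than frequency coefficients: a keyed pseudorandom sequence is added at an amplitude too small to notice, and recovered later by correlation.

```python
import numpy as np

# Hypothetical spread-spectrum watermark sketch: a keyed pseudorandom +/-1
# sequence is added at low amplitude across every sample, then recovered by
# correlating against the same keyed sequence. (Illustrative only; real
# systems typically work on frequency coefficients of the image or audio.)

content = np.random.default_rng(0).normal(size=4096)   # stand-in for media data

key = 1234                                              # secret watermarking key
watermark = np.random.default_rng(key).choice([-1.0, 1.0], size=content.size)

alpha = 0.05                                            # small enough to be imperceptible
marked = content + alpha * watermark

# Detection: the normalized correlation is ~alpha for marked content, ~0 otherwise.
score = float(np.dot(marked, watermark) / marked.size)
print(f"correlation score: {score:.3f}")
```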

Much like physical watermarking, AI watermarking embeds a signature or code into content to indicate its origin. To create a sequence of text, large language models predict the next most likely word based on their training data; SynthID embeds its “signature” by adjusting the probability of certain words in such a way that the change is detectable to software but not to humans. For instance, Harry Potter might be the most likely end to the following phrase: “My favorite book is __.” But a digital watermark will shift the probabilities of the possible answers so that, say, Lord of the Rings comes out the winner. The process is repeated throughout the entire text, so a single sentence can contain ten or more adjusted probability scores. A watermark detection classifier is then trained on lots of watermarked and non-watermarked text, enabling it to recognize the difference between the two and pick up on the digital signature. 
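SynthID’s exact scheme is laid out in DeepMind’s Nature paper; the hypothetical Python sketch below illustrates the general probability-shifting idea with a simpler keyed approach (sometimes called a “green list” watermark, and not SynthID’s own algorithm), in which words favored by a secret key are made slightly more likely and a detector counts how often they appear.

```python
import hashlib
import random

# Simplified, hypothetical probability-shifting watermark (a keyed "green
# list" scheme), not SynthID's actual algorithm. A secret key plus the
# previous word deterministically selects a favored half of the vocabulary;
# generation nudges those words up, and detection counts how often they appear.

KEY = "secret-watermark-key"

def favored(prev_word: str, vocab: list[str]) -> set[str]:
    """Deterministically pick half the vocabulary, keyed on the previous word."""
    seed = int(hashlib.sha256((KEY + prev_word).encode()).hexdigest(), 16)
    return set(random.Random(seed).sample(vocab, k=len(vocab) // 2))

def pick_next(prev_word: str, probs: dict[str, float], bias: float = 1.5) -> str:
    """Re-weight the model's next-word probabilities so favored words win more often."""
    green = favored(prev_word, list(probs))
    weights = [p * (bias if w in green else 1.0) for w, p in probs.items()]
    return random.choices(list(probs), weights=weights, k=1)[0]

def detect(words: list[str], vocab: list[str]) -> float:
    """Share of words drawn from the favored set; roughly 0.5 for unwatermarked text."""
    hits = sum(w in favored(prev, vocab) for prev, w in zip(words, words[1:]))
    return hits / max(len(words) - 1, 1)
```

On ordinary prose, the favored-word share of watermarked text climbs well above the roughly fifty percent baseline that unwatermarked text would show, which is the statistical signal a detector looks for.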

While SynthID has been in the works for over a year, it wasn’t until last week that Google opened up the underlying code, via the data-science platform Hugging Face. (The DeepMind researchers detailed their work in a paper published in Nature.) Making the tool publicly available may serve as a nudge to other major AI companies to incorporate watermarking into their own models. “With better accessibility and the ability to confirm its capabilities, I want to believe that watermarking will become the standard, which should help us detect malicious use of language models,” João Gante, a machine-learning engineer at Hugging Face, told the MIT Technology Review.

One of the most promising features of SynthID, at least according to Google, is that the watermarking process doesn’t require much computing power or compromise the quality of AI output. To demonstrate this latter point, the DeepMind researchers conducted a live experiment that assessed feedback from nearly twenty million chat responses generated by the Gemini model (users can provide feedback by pressing a thumbs-up or -down button). The findings showed that users’ satisfaction with the output didn’t change based on whether the content was watermarked or not. 

If SynthID is resilient, it is not bulletproof. According to an editorial in Nature, “it is still comparatively easy for a determined individual to remove a watermark and make AI-generated text look as if it was written by a person.” SynthID’s image watermarks can withstand common modifications like cropping, filtering, color adjustments, and compression, but its text watermarks are less robust when the text is translated or heavily rewritten. The DeepMind researchers also noted that SynthID doesn’t work as well on strictly factual information, since there is less word-probability wiggle room. (For instance, there aren’t a lot of options to end the phrase “The capital of France is __.”) Still, SynthID is a big improvement over some other available AI-detection tools. Since ChatGPT first made waves, in late 2022, hundreds of such tools have emerged on the market, including software aimed at helping educators identify when students used a chatbot to complete an assignment. Such software has helped deter cheating, but according to the Washington Post, it has also sometimes produced false positives, wrongly accusing innocent students. 


Of course, watermarking tools will only become useful if companies agree to use them. In the summer of 2023, the White House secured voluntary commitments from several tech companies, including Google and OpenAI, to self-manage AI risks and develop “robust technical mechanisms to ensure that users know when content is AI generated, such as a watermarking system.” OpenAI developed a system about a year ago for watermarking ChatGPT-generated text and a tool to detect watermarks, according to the Wall Street Journal. But the company has reportedly been sitting on the tool due to internal division about whether to release it. A survey conducted by OpenAI found that nearly a third of loyal ChatGPT users would be “turned off” by the anti-cheating technology. “In trying to decide what to do, OpenAI employees have wavered between the startup’s stated commitment to transparency and their desire to attract and retain users,” the Journal reports. 

Whether tech companies like it or not, there are efforts elsewhere to make watermarking mandatory. In May, the European Parliament approved the Artificial Intelligence Act, which requires tech companies to label deepfakes and AI-generated content. The act also requires companies to develop AI-generated media in a way that makes detection possible. California followed suit last month, passing a bill that cracks down on sexually explicit deepfakes and requires developers to include watermarking in AI-generated content. And in China, new regulations go as far as punishing social media platforms when AI-generated content is widely disseminated without being properly classified, according to Wired.

If watermarking regulations continue to be passed, the underlying research could see a boom—one that, according to the Nature editorial, is much needed. “The work is an important step forwards,” Nature wrote, referring to DeepMind, “but the technique itself is in its infancy. We need it to grow up fast.” If watermarks can become “watertight,” so to speak, they could help shine a light on the explosion of AI-generated fake news that has propelled disinformation across the Web. Watermarks could also reassure readers that an article has indeed been written by the human journalist named in the byline, and not by an AI prone to bias and hallucinations. 


Other notable stories:

  • CJR asked foreign correspondents who are covering the US election for audiences back in their home countries to describe their experiences on the trail. Richard Hall, of the British newspaper The Independent, writes about the fascinating “contrast between the power bestowed upon the victor and the places where that victory is won,” often small towns and hamlets in swing states. Tiffany Weiyang Le, of the Singapore-based Chinese-language outlet Initium Media, is focusing her reporting on Chinese Americans “who have not traditionally been very active in US elections, perhaps for reasons of historical racism or cultural difference,” but are now getting more involved. Meanwhile, Kourosh Ziabari, a freelance Iranian journalist, is covering relations between his country and the US, amid concern that “none of the dominant narratives seem to signal peace.”
  • And Commonweal, which describes itself as America’s oldest lay-edited Catholic opinion journal, published a special issue marking its centennial. The publication faces a big question: “What does it mean to be a leading intellectual voice of liberal Roman Catholicism at a moment of continuing loss of faith in both the church and liberalism?” Jennifer Schuessler writes in the New York Times—but for its fans, it remains “a stalwart defender of a liberal Catholic intellectual tradition that is embattled but hardly dead.”


Sarah Grevy Gotfredsen is a computational investigative fellow at the Tow Center for Digital Journalism at Columbia University. She works on a range of computational projects on the digital media landscape, including influence operations conducted through news media and the information ecosystem. She graduated from Columbia University in 2022 with an MS degree in data journalism.