Tow Center

Q&A: THE BACKFIRE EFFECT – IS FACT CHECKING DOING MORE HARM THAN GOOD?

How political scientist Dr Yamil Velez is using AI to find the best approach for tackling misinformation

October 22, 2024
Image: Adobe Stock


Two doom-laden predictions dominated practically every preview of the 2024 United States elections. Firstly, artificial intelligence (AI) would “turbocharge” political misinformation – flooding the public sphere with ever more sophisticated fakes and falsehoods. Secondly, these deceptions would influence the outcome of the elections unless effectively countered by the most comprehensive fact-checking operation ever embarked upon.

However, with the election just days away, the tone has somewhat changed – replaced by a growing sense that the most dystopian prophecies were wrong, or at least exaggerated. Of course, there have been attempts to sow disinformation amongst the electorate – whether originating from Russian troll farms or directly from the Republican candidate.

But while falsehoods abound, AI-based deceptions do not seem to be manipulating voter beliefs or behavior in the ways some feared. That said, it remains unclear whether efforts to combat mis- and disinformation more broadly are having their desired effect. Sure, few (if any) people genuinely believe AI depictions of Haitian immigrants “eating the pets.” And yet, those false claims endure and spread – at least amongst large portions of the electorate – despite the best efforts of everyone from Kamala Harris to the local police chief. What’s more, such attempts to persuade voters that they are being misled may well be making the situation worse…

To find out more, Tow sat down with Dr Yamil Velez, an Assistant Professor of Political Science at Columbia University, to discuss why fact checkers can make misinformation more impactful when they strike the wrong tone, and how AI tools are transforming the ways in which researchers examine voter behavior.

Our conversation has been edited for length and clarity.

Your most recent paper probes the so-called “backfire” thesis: the argument that attempting to correct misinformation can cause respondents to double down on their views, essentially making the problem worse. Where do your findings sit within the debate?

There was a 2010 article that created a really significant conversation, not only within academia but also in the broader news community, about whether you could effectively combat misinformation. The researchers focused on a variety of issues – one of which was the weapons of mass destruction (WMDs) claim that emerged in the mid-2000s. They found that, basically, if you tried to correct people who were extremely conservative and told them that no WMDs were found in Iraq, that actually increased their belief in the misinformation. This prompted all of these concerns that there could be pieces of misinformation that you shouldn’t actively correct, because you could make the problem worse.

What’s interesting, though, is that there was a follow-up study with a similar design conducted by Ethan Porter and Thomas Wood. They examined 52 issues and found zero instances of backfire. 

Why do you think the 2010 study identified “backfire” as a significant problem? 

One perspective is that WMDs were still hotly debated while the study was being conducted – or, at least, the issue was still fresh in people’s minds – and maybe people still had a motivation to defend that position. Another perspective is that sometimes we have statistical flukes in research, and that’s why it’s important to replicate. Sometimes you can find evidence of a pattern that may have been a one-in-twenty event.

I ended up collaborating with Porter and Wood during the rollout of the Covid-19 vaccines, and I bought the perspective that backfire was rare, but I was skeptical that partisans across the aisle would respond similarly to corrective information. My expectation was that you’re going to correct Republicans on vaccines and Covid-19 and there’s going to be far less movement. They might not backfire, but they’re not going to be as receptive relative to Democrats. I expected the same thing would hold for vaccine skeptics too. We didn’t find that. We actually found very similar rates of [views] updating in response to the corrective information – something I was not expecting. 

This ended up aligning with work that was coming out around the same time by a scholar at Yale, Alex Coppock, who was finding that persuasive messages tend to move people to a similar degree – regardless of their predispositions. So, generally, if something has a persuasive effect that shifts attitudes in the direction of a message by, say, five percent, then that tends to hold amongst Democrats, Republicans, and Independents – even if they have different baselines. So, my conclusion from the Covid-19 studies was that backfire is rare, and people are actually receptive to corrective information.

Your most recent study considered how strongly respondents held the views that would be challenged. How did you use large language models (LLMs) to do this? 

In the aftermath of these Covid studies I had a very optimistic take: people are correctable. What was lingering in the back of my head was that we were correcting viral pieces of misinformation. Some of these claims might not be things that people deeply care about, or are integral to the way they think about politics. They might just be some stray claim they’ve seen online. And so, yes, you can correct it – but it’s not going to ultimately affect people’s core political beliefs. What I wanted to do was conduct an even harder test of whether people are correctable or not.

There was a study in 2006 which argued that anytime you try to expose people to counter-attitudinal information, it tends to reinforce their predispositions. The study created this theoretical framework to understand why backfire might occur. And part of their story was, you have to be motivated to defend your position for this polarization to be observable. You can imagine most people have a neutral position on an issue. If they get corrected, they’re not going to backfire. They’re not going to try to defend the position. But if it’s something integral to how you think about politics – something like, let’s say, your abortion views or views on immigration – correction might be more difficult.

And so, as LLMs started getting better, I started toying around with [co-author] Patrick Liu – taking these models and tailoring information to participants. We thought this would be a great test of the hypothesis that backfire was only observed when people cared deeply about a topic, and so we asked people “what’s an issue that you prioritize, or you care about deeply?” And then we were able to provide people with these tailored counter-arguments. 
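(For readers curious about the mechanics, a minimal sketch of that kind of tailoring is below. It assumes the OpenAI Python client; the model name, prompt wording, and function name are illustrative stand-ins, not the study’s actual materials.)

```python
# A minimal sketch (not the study's actual code) of issue-tailored
# counter-argument generation. Assumes the OpenAI Python client and an
# OPENAI_API_KEY in the environment; the model name and prompt wording
# are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

def tailored_counterargument(issue: str, stated_position: str) -> str:
    """Ask the model for a counter-argument to a participant's own position."""
    prompt = (
        f"A survey participant says the issue they care most about is: {issue}.\n"
        f"Their stated position is: {stated_position}\n"
        "Write a concise, evidence-based counter-argument to that position "
        "in a neutral, fact-check-like tone."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Hypothetical usage:
# print(tailored_counterargument("immigration",
#                                "Legal immigration should be sharply reduced."))
```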

How useful did you find the LLMs? 

The biggest critique I’ve received in doing this kind of work – especially when you’re creating perspectives and messages – is that [LLM-produced content] sometimes comes across as artificial and anodyne in ways that human speech does not. But I’m not convinced. We did a lot of tests – like comparing the persuasiveness of human arguments versus AI-produced arguments – and they were roughly comparable. And there had been work conducted around the same time that looked at just the persuasive effects of these messages. They were pretty persuasive relative to an ordinary person coming up with arguments.

I see significant benefits in these tools. I don’t think we should cede ground to people who might use them for nefarious purposes. We should study how these tools work, in part, because it might help us combat the more malicious use cases. A recent paper by Rand, Pennycook, and Costello asked people to disclose conspiracy theories that they might hold, and they used AI to effectively debunk these conspiracy theories. The researchers found some pretty sizable effects that persisted months later. You can imagine a world where the LLMs themselves start inserting debunks or flags if people are discussing content that’s either blatantly false, or coming from an inauthentic source. 

You used the LLM to adapt the messaging content for the latter groups – making it more “emotionally charged,” so that it “directly attacks what the person said.” Is that why you were able to identify the backfire effect?

The instruction for the LLM in the earlier groups was to provide a counter-argument. For the latter two studies, it was to provide an affectively charged counter-argument. What we noticed is that there was more moralizing language inserted – like “this position is wrong” or “that’s absurd” – that we didn’t observe in the earlier studies, where it was very much just logic-based; very anodyne, neutral, reminiscent of a fact check. And what we found in the [latter] cases was the backfire pattern. We found that people actually increased their attitude strength with respect to this issue position that they disclosed.

We think about those findings as a nice reconciliation of the debates that have emerged within political psychology over whether people are correctable: with studies looking at fact checks on one hand – finding correction effects – and earlier studies that actually used more affectively charged counter-arguments, which did find evidence of backfire. We see this as speaking to the broader discussion of whether persuasion can be hindered when vitriol and incivility are introduced. We’re going to do some follow-up studies, but I think that if you raise people’s defenses they might be less persuadable.
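(For the technically minded, here is a hedged sketch of how the two prompt conditions Velez describes might look in code. The instruction wording is assumed for illustration; it is not drawn from the studies’ actual materials.)

```python
# Sketch of the two prompt conditions described above; all wording is
# assumed for illustration, not taken from the studies' materials.
NEUTRAL_INSTRUCTION = (
    "Provide a logic-based counter-argument to the participant's position, "
    "in a neutral, anodyne tone reminiscent of a fact check."
)

CHARGED_INSTRUCTION = (
    "Provide an affectively charged counter-argument that directly attacks "
    "the participant's position, using moralizing language such as "
    "'this position is wrong' or 'that's absurd'."
)

def build_prompt(instruction: str, issue: str, stated_position: str) -> str:
    """Combine a participant's disclosed issue and position with one of the
    two instructions, producing the text sent to the language model."""
    return (
        f"Issue: {issue}\n"
        f"Participant's position: {stated_position}\n"
        f"{instruction}"
    )
```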

In late 2023 you co-authored a paper about AI and the upcoming U.S. Election. How have your observations tallied with what you thought back then? 

I think dystopian predictions of us being inundated with AI content have not really been borne out. I think the expectation that, as we were approaching the election, it would be hard for people to even tell what’s real or not hasn’t really materialized. That’s not to say that it won’t be an issue in the future, but I will say, going into the 2024 election, I haven’t seen any cases of AI-generated content that really moved the needle.

Some of the instances of AI content have been, I think, pretty blatantly false. The Taylor Swift endorsement; Kamala [Harris] standing at the DNC with a hammer and sickle behind her… In some ways that is filling the same role as memes, and I think we should maybe have a broader discussion about, not so much whether you can fool people with generative AI, but the fact that memes or cartoons can sometimes have the same persuasive effects as the strategic use of generative AI. If the message is there, the message might be powerful enough on its own, and it might not need generative AI to complement it.

About the Tow Center

The Tow Center for Digital Journalism at Columbia's Graduate School of Journalism, a partner of CJR, is a research center exploring the ways in which technology is changing journalism, its practice and its consumption — as we seek new ways to judge the reliability, standards, and credibility of information online.
