UNIVAC, the first commercial general-purpose digital computer, was put into production in 1951. By 1952, journalists at CBS were already using it to predict the outcome of the presidential election based on early vote returns. They even featured it live on air.
Almost seventy years later, journalists are still at it.
In the past decade the early successes of Nate Silver and his site FiveThirtyEight have helped solidify data-driven election prediction as an important, even expected, product from major newsrooms. We’ve also seen predictive data journalism applied to a growing array of other topics, from sports to culture and business. The New York Times recently published an interactive model exploring how much worse the novel coronavirus could get in the absence of swift action.
And yet a recent study found that only about 5 percent of data journalism projects have some kind of projective outlook, an eye on the future. As predictive journalism expands beyond its roots in elections into a variety of other social domains, what should journalists be thinking about to use it ethically in their practice?
Predicting aspects of social life is quite a different challenge from, say, forecasting the weather. Weather predictions don’t have to take into account the idiosyncratic behavior of individuals, who make their own choices but are also subject to influence. This creates a new ethical dilemma for journalists, who must reckon with how their news organizations’ behavior (publishing or framing information in particular ways) might influence how predicted events unfold.
Consider the potential impact of election predictions in 2016, many of which had Hillary Clinton as the clear favorite. It’s possible that individual Hillary supporters saw those predictions and thought something like, “She’s got this one in the bag. I’m busy on Tuesday, and my vote won’t be decisive anyway, so I don’t need to vote.” According to one study, election predictions may indeed depress voter turnout, depending on how those predictions are presented to people.
The important point here is that the act of publication may create a feedback loop that dampens (or amplifies) the likelihood of something actually happening. News organizations that publish predictions need to be aware of their own role in influencing the outcome they are predicting.
Of course, publishing a prediction about a sporting event may influence behavior (e.g., betting) that affects individuals but doesn’t rise to the level of affecting society, while an election or public health prediction could be far more influential.
News organizations should be thinking carefully about how they expect predictions to be used by readers. How might a published prediction change individual behavior in the future? What individual decisions might it affect, and what are the implications of a mistaken prediction that misleads someone? Journalists need to think through the social dynamics of their projections.
Let’s look at a recent, timely example through this lens: the Times’ piece on “How Much Worse the Coronavirus Could Get, in Charts.”
Those charts depict the projected peak number of infections, the peak number of ICU cases, and the total number of deaths as a function of when and how severely interventions are taken. Interactivity allows the user to explore how an earlier, later, milder, or more aggressive intervention strategy might change those outcomes.
The article clearly articulates its goals, quoting epidemiologist Ashleigh Tuite, who helped develop the model. “The point of a model like this is not to try to predict the future but to help people understand why we may need to change our behaviors or restrict our movements, and also to give people a sense of the sort of effect these changes can have,” Tuite says. Here, the model’s predictions are explicitly about changing behavior: helping readers (both citizens and policymakers) to see the positive implications of acting immediately and aggressively to “flatten the curve.”
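The dynamic those charts visualize, that earlier or stronger interventions lower the infection peak, can be illustrated with a toy SIR (susceptible–infected–recovered) model. To be clear, the parameters and intervention days below are hypothetical, not those of the Tuite model; this is a generic sketch of the mechanism, not a reconstruction of the Times’ charts.

```python
# Toy SIR epidemic model: an intervention on a given day cuts the
# transmission rate, lowering the peak number of simultaneous infections.
# All parameters here are hypothetical illustrations.

def peak_infected(beta=0.3, gamma=0.1, population=1_000_000, days=365,
                  intervention_day=None, reduction=0.5):
    """Simulate SIR with daily Euler steps; return the peak infected count."""
    s, i, r = population - 1.0, 1.0, 0.0
    peak = i
    for day in range(days):
        b = beta
        if intervention_day is not None and day >= intervention_day:
            b *= (1 - reduction)  # intervention cuts transmission
        new_infections = b * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return peak

no_action = peak_infected()
late_action = peak_infected(intervention_day=60)
early_action = peak_infected(intervention_day=30)

# Acting earlier "flattens the curve": each earlier intervention lowers the peak.
assert early_action < late_action < no_action
```

Even this crude sketch reproduces the article’s central point: the same disease parameters produce very different peaks depending on when society acts.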
The article hedges about the uncertainty in the model, suggesting that warmer spring weather could affect outcomes in unknown ways. Communicating the uncertainty of a prediction, while challenging, can help soften the ostensible authority of a mathematical model. Conveying uncertainty can take various forms, including textual signals of contingency, confidence bounds on charts, explanations of probabilities, and articulation of multiple potential outcomes. Journalists need to develop more ways to do this well.
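One concrete way to articulate multiple potential outcomes is to run the same model across a range of plausible parameter values and publish the spread rather than a single number. The sketch below uses the standard SIR final-size relation; the range of reproduction numbers is made up for illustration, not drawn from the Times’ model.

```python
import math

def final_attack_rate(r0, iterations=200):
    """Final share of the population ever infected in a simple SIR model,
    from the standard final-size relation z = 1 - exp(-r0 * z),
    solved by fixed-point iteration."""
    z = 0.9  # starting guess
    for _ in range(iterations):
        z = 1 - math.exp(-r0 * z)
    return z

# Instead of publishing one number, report the range implied by
# several plausible (hypothetical) values of the reproduction number R0.
plausible_r0 = [1.5, 2.0, 2.5, 3.0]
rates = [final_attack_rate(r0) for r0 in plausible_r0]
low, high = min(rates), max(rates)
print(f"Projected share ever infected: {low:.0%} to {high:.0%}")
```

A range like this is harder to misread as certainty than a lone point estimate, and it makes the model’s sensitivity to its assumptions visible to the reader.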
The Times article could go further in examining the implications of the model’s assumptions and in considering competing interests (e.g., individual liberty and freedom of movement, the health of the economy) at varying levels of intervention. And while there is a fair degree of transparency in terms of how the model was parameterized (e.g., a 1 percent case fatality rate was assumed), the authors could provide another layer of higher-fidelity transparency for the true wonks. Layering transparency information in this way can serve readers with different levels of interest.
Although the accuracy of the model is only truly knowable in retrospect, making the nuts and bolts of its process visible can at least help readers put predictions in perspective. If a model is built on a set of flimsy assumptions, readers can be appropriately skeptical of what it tells them.
One organization practicing exceptional transparency in predictive modeling for the 2020 elections is the Washington Post. In addition to blog posts detailing its election modeling, the Post publishes in-depth academic papers, and even the code for some models. As data scientist Lenny Bronner writes, “It’s important to explain what our models can and cannot do.” It’s not that everyone will look at all that information, but that it’s available for inspection to the few who really want to kick the tires. (Disclosure: I spent the fall of 2019 on sabbatical at the Post.)
Notably, the Times’ covid-19 model sits in the opinion section, as do models from the Post related to predicting the Democratic primary. Statistical models, and their predictions, are interpretations of data that contain a variety of subjective decisions. At the same time, this isn’t really the same kind of individual subjectivity you typically find in an opinion piece. Careful modeling is closer to a form of analysis: a well-grounded interpretation based on evidence and data. Additional transparency can expose the subjectivity in that interpretation, as can end-user interactivity with some of the subjective modeling decisions or parameters.
As predictive journalism grows beyond its roots in elections, transparency, uncertainty communication, and careful consideration of the social dynamics of predictive information will be essential to its ethical use. We should expect the experiences of data journalists to coalesce into a set of ethical expectations and norms. We’re not there yet, but perhaps one day there will even be a style guide for predictive journalism.