The use of AI in data journalism: what are the ethical implications?

Artificial intelligence is already being used in some newsrooms to mine data, create algorithms and automatically generate content. Using this technology on a daily basis raises new questions for journalists. Some experts claim that we are living through a transitional phase, and that we have to make a decision about the future use of this technology in the media world, especially in data journalism.

The first edition of the ESMH Summer School – “AI and journalism” – raised many questions about the ethical implications of using AI technologies in journalism. After discovering how AI works and how it is being (and will be) deployed in newsrooms, the 80 young journalists who participated in the summer school reflected upon impartiality, responsibilities and the potential limits of its use. We talked to two experts to explore the ethical questions this raises for data journalism.

We interviewed Idoia Salazar, Professor at the School of Communication of San Pablo CEU University and Principal Investigator for Social Impact of Artificial Intelligence and Robotics (SIMPAIR), and Nicholas Diakopoulos, Professor of Computational Journalism at Northwestern University.

What impact does technology have on ethics in journalism?

Idoia Ana Salazar García, professor at the School of Communication of San Pablo CEU University: “Some media organisations have tools for internal use rather than for writing news. The Los Angeles Times in the USA is one example. Media organisations do have some algorithms that gather data on a daily basis from databases, such as earthquake databases. They use these databases to write last-minute reports. Another useful application of AI algorithms is documentation. (…) AI algorithms can help with the analysis of lots of data for a specific journalistic report as part of an investigation”.

Nicholas Diakopoulos, Assistant Professor in Communication Studies and Computer Science at Northwestern University: “Machine learning has some fundamental implications for ethics in journalism. Take for instance the case of machine learning being used in a data mining context to identify an individual or example worth mentioning in a story. Since there will always be some statistical uncertainty associated with these methods, journalists need to understand the ethics of using uncertain statistical evidence. (…) Another case is the use of predictive modelling in journalism, such as the types of election models that 538 and the New York Times publish in the USA. Is it ethical for journalists to publish predictions on voting day that could impact voter turnout? Finally, there’s the question of transparency when journalists use opaque methods in their journalistic process”.

What are the main ethical implications of using AI in data journalism?

Experts agree that the biggest challenge for journalists using AI in their work is understanding the data they are using and knowing how to manage it.

Idoia Ana Salazar García: “The ethics ruling our profession will almost be the same. We will only have to worry about the reliability of the sources we use: that is to say, we will need to know which data is feeding the algorithm in order to understand its behaviour. The journalists are not using the data that feeds the algorithm, but they use the conclusions they draw from the data. (…)”

Nicholas Diakopoulos: “Some of the main issues, I think, relate to uncertainty around the accuracy of evidence produced by AI systems, as well as the labelling of automation to ensure that end users are aware of its use and therefore better understand potential errors or issues that may arise. Another issue is the quality of data that is fed into AI systems. It’s well understood that if biased data is fed into a machine learning system, the system will learn those biases. Journalists need to be acutely tuned to this issue so that their use of AI doesn’t simply replicate biases that are present in the datasets they use. They should be aware of the potential impact of the biases in the data on their use of an AI model trained on that data”.

Who will ultimately be responsible if information written by an AI system turns out to be false?

Idoia Ana Salazar García: “The responsibility would usually fall on the manager in a media case. In the end, the machine is going to be like another worker helping the journalist when it comes to making a decision. That is why educating and training media managers and editors is so important: decisions taken by algorithms can have significant social implications”.

Nicholas Diakopoulos: “We must always identify the person responsible for an AI system so they can be held accountable. It’s all too easy for organisations to blame the algorithm if something goes wrong, but these systems are all sociotechnical, comprised of both technical components and human components. (…) Of course, which people to hold accountable is another difficult question”.

Would AI news articles be as credible as articles written by journalists?

Using AI in journalism calls the credibility of the media and journalists into question, especially regarding authorship.

Idoia Ana Salazar García: “I believe it can have even more credibility, taking into account the potential that artificial intelligence systems have to analyse much more information than a journalist can throughout his/her life”.

Nicholas Diakopoulos: “Some of the studies show that automatically produced articles using templates are slightly more credible than articles produced by human journalists. But when more sophisticated text generation techniques are used, they can introduce grammatical errors that undermine the credibility of a text”.

Should we establish some ethical limits for using AI in the newsroom, especially when using data as a source of information?

Both experts agree that it is very difficult nowadays to set ethical limits, aside from those that have always applied to all journalists.

Idoia Ana Salazar García: “I don’t think it’s good to set limits, but we need to analyse this type of technology. Journalists must know which database is feeding an algorithm that is going to be used for a certain type of media. There are certain precautions to be taken rather than limits to impose. I would say: ‘Beware of data!’ You have to be more cautious about more things”.

Nicholas Diakopoulos: “We’re still at a fairly early stage in the adoption of AI techniques in journalism and so it’s difficult to know precisely what all of the limits might be. However, I think it would be worthwhile for news organisations who use AI [to] hold discussions on acceptable use in relation to issues like accuracy and uncertainty, prediction, bylines and labelling, and biased data”.