European Science-Media Hub

Generative AI in science: What are the challenges and opportunities?


Generative artificial intelligence (AI) has quickly become an important scientific tool, yet its accelerating integration creates both opportunities and challenges. A workshop on 29 April 2025, organised by the European Parliament’s Panel on the Future of Science and Technology (STOA), will bring together MEPs, Commission representatives, and leading researchers to explore these tensions and chart a path forward that harnesses AI’s capabilities while mitigating risks.

“Certainly, here is a possible introduction for your topic…” began an article published in an Elsevier scientific journal in 2024. To regular ChatGPT users, this language is very familiar. The article, since retracted for using AI without disclosure, sparked debate about the use – and misuse – of generative AI in science.

Generative AI is a branch of machine learning based on transformer models: a type of neural network architecture that can generate new output based on patterns in large amounts of training data. This includes Large Language Models (LLMs), such as ChatGPT, Claude, and Perplexity AI.
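As a simplified illustration of this generative principle – learning patterns from training text and sampling new output from them – a toy next-word model can be sketched in a few lines. This is not a transformer: real LLMs learn far richer patterns with billions of parameters, but the core idea of predicting the next token from patterns observed in training data is the same.

```python
import random
from collections import defaultdict

# Toy "language model": record which word tends to follow which
# in a tiny (illustrative) training text.
training_text = "the cell divides and the cell grows and the cell divides"
words = training_text.split()

# Count next-word occurrences for each word in the training data.
followers = defaultdict(list)
for current, nxt in zip(words, words[1:]):
    followers[current].append(nxt)

def generate(start, length, seed=0):
    """Sample a new word sequence from the learned patterns."""
    random.seed(seed)
    out = [start]
    for _ in range(length):
        options = followers.get(out[-1])
        if not options:
            break
        out.append(random.choice(options))
    return " ".join(out)

print(generate("the", 5))
```

The output is fluent with respect to the training data but carries no notion of truth – which is also why larger models of the same family can produce confident-sounding errors.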

Scientists are increasingly using LLMs to help with everything from summarising and brainstorming, to editing, writing and even reviewing articles. At least 10% of abstracts published in 2024 on PubMed were written with LLMs, researchers estimate, and ChatGPT has even been listed as a co-author of several scientific papers.

The European Commission is currently writing a European AI in Science Strategy, aimed at accelerating the use of AI in science. Few contest that AI can aid scientific discovery: this was on stark display when Google DeepMind scientists won the Nobel Prize in Chemistry for developing AlphaFold2 – an AI program which solved the 50-year-old problem of protein structure prediction.

The benefits of using generative AI models in science are more disputed. Concerns include hallucinations, academic misconduct, and potential copyright infringements. But the responsible use of generative AI could also increase the accessibility of science, improve scientific communication, and unleash creativity.

These topics are the focus of the upcoming STOA workshop ‘Generative AI and scientific development’ on 29 April. Ahead of the event, the European Science-Media Hub spoke with keynote speakers Professor Sonia Contera of the University of Oxford (UK), and Professor Serge Belongie of the University of Copenhagen (Denmark) and head of the Danish Pioneer Centre for Artificial Intelligence.

Hallucinations

AI hallucinations are one of the biggest worries concerning the use of generative AI in science. LLMs have a slippery relationship with truth: they have a persistent tendency to produce nonsensical and incorrect output such as made-up academic references, citing fake authors and titles. This conflicts with the scientific pursuit of accuracy.

Spanish physicist and nanotechnology expert Sonia Contera explains: “The main problem of generative AI is that it intrinsically generates error. There’s always about 15% errors in whatever you produce. It’s difficult because the AI itself does not know when it makes an error or not, so you need to supervise the results.” – Read the full interview with Sonia Contera


Given the tendency of LLMs to hallucinate, users need to exercise caution.

As Serge Belongie, professor in Machine Learning at the University of Copenhagen, argues, generative AI tools “must be used in the hands of scientists with high AI literacy. In the wrong hands, these techniques can produce wild goose chases with regrettable carbon footprints.” – Read the full interview with Serge Belongie


This view is echoed by the European Commission’s guidelines on the responsible use of generative AI in science, which urge researchers to “maintain a critical approach to using GenAI and continuously learn how to use it responsibly to gain and maintain AI literacy.”

AI slop, paper mills, and the unprecedented pace of AI-science

Another concern with LLMs in science is the risk of inaccurate AI-generated papers drowning out high-quality research.

Serge Belongie calls this the “AI slop problem”, involving “fake conference papers, confabulated data, plagiarism, and citations of non-existent references.” Generative AI will only worsen the pre-existing problem of ‘paper mills’: businesses that produce fraudulent, poor-quality scientific manuscripts and sell the authorship. At risk is the credibility of science.

The sheer pace of AI-assisted science also poses a challenge. Even before the dawn of generative AI, there was an inflation in the number of scientific publications. Now that LLMs can write papers in seconds, this inflation is only set to worsen. Indeed, the number of publications could become so great that human researchers are unable to keep up, leaving them dependent on AI-generated summaries.

A ‘publish or perish’ scientific community, where scientists are incentivised to maximise the number of papers they write, exacerbates these risks. As Sonia Contera explains: “One of the metrics of success is the number of publications, which is a crazy thing in itself. We give people points for publishing, so people game the system. In this context, AI is making a mess. With AI, you can churn papers. So if the publication world was broken before, it’s going to be chaos now.”

AI reviewers

Increasingly, AI chatbots are also used to peer-review research papers. By tracing AI buzzwords such as ‘commendable’, ‘innovative’, ‘meticulous’ in 146,000 peer reviews before and after the launch of ChatGPT, researchers found that 17% of peer-review reports are now substantially modified by chatbots. With the rise of AI reviewers, many fear that live scientific dialogue is shifting to an automatic exchange where science is conducted by machines for machines.
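The detection method behind that estimate rests on a simple idea: certain adjectives appear far more often in LLM-written text than in human-written text, so their frequency in a corpus of reviews can be compared before and after ChatGPT’s launch. A minimal sketch of such a buzzword-frequency comparison follows – the word list is taken from the article, but the review snippets here are invented for illustration, not the study’s actual data.

```python
import re
from collections import Counter

# Adjectives reported as disproportionately frequent in LLM-generated reviews.
AI_BUZZWORDS = {"commendable", "innovative", "meticulous"}

def buzzword_rate(reviews):
    """Fraction of all words across the reviews that are AI-associated buzzwords."""
    words = [w for text in reviews for w in re.findall(r"[a-z]+", text.lower())]
    if not words:
        return 0.0
    counts = Counter(words)
    return sum(counts[w] for w in AI_BUZZWORDS) / len(words)

# Illustrative (invented) review snippets before and after ChatGPT's launch.
pre_chatgpt = ["The method is sound but the evaluation is limited."]
post_chatgpt = ["This commendable and innovative paper offers a meticulous analysis."]

print(buzzword_rate(pre_chatgpt), buzzword_rate(post_chatgpt))
```

A jump in this rate after a fixed date is indirect evidence of chatbot use; the actual study models the excess frequency statistically rather than relying on raw counts.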

Time-pressure on scientists is adding to the problem. As the lead author of the study above, Weixin Liang, a computer scientist at Stanford University, notes: “It seems like when people have a lack of time, they tend to use ChatGPT.” Sonia Contera concurs: “Scientists are overworked, constantly asked to review proposals. So of course, people are just putting these proposals into ChatGPT.”

The automation of peer review is, in turn, likely to reduce incentives for scientists to write papers themselves. According to Serge Belongie, if “reviewers get lazy or sloppy with AI tools, it erodes trust and saps motivation from scientists who may already feel overworked and underpaid.” This could in turn increase the use of AI-generated papers, fuelling a vicious cycle.

Copyright issues

Another risk with AI reviewers concerns copyright. LLMs are being given access to confidential and unpublished material. As Sonia Contera notes, “We are giving away our IP. It is a huge risk. When you store your proposal in a cloud repository, you don’t know the terms and conditions. Do they own it?”

According to the EC guidelines on generative AI in science, researchers should “refrain from using GenAI tools in sensitive activities e.g. peer reviews or evaluations.” However, avoiding the use of generative AI entirely risks foreclosing opportunities for AI to improve peer review. LLM tools can tidy up peer-review feedback and spot errors in research papers. Generative AI – used responsibly – offers many positive opportunities, to which we now turn.

Improved accessibility and communication

LLMs can lower barriers to entry in science, allowing researchers with limited budgets to make progress on core scientific issues. They can also aid multidisciplinary research by helping scientists to quickly familiarise themselves with new fields.

Serge Belongie explains: “LLMs are significantly lowering the barrier of entry for people to write code for everything from statistical analysis to iPhone apps. As a result, a molecular biologist, for example, with no background in Computer Science, can now write nontrivial code to assist in drug discovery”.

Generative AI can also aid scientific communication. In the blink of an eye, LLMs can adapt the presentation of scientific results to different audiences. Tools such as Google’s NotebookLM, for instance, can generate a two-host podcast from a scientific paper in mere seconds. In addition, LLMs can help researchers who are not native English speakers to read and communicate in English by providing grammar and vocabulary suggestions.

Unleashing creativity

Some researchers believe that AI could fully automate the process of scientific discovery. For instance, Sakana AI recently made headlines for producing an entirely AI-generated article that passed peer review at an international AI conference.

Sonia Contera and Serge Belongie agree that LLMs can support the creative process of science, by helping to brainstorm ideas and quickly summarising information. “Even when it makes a mistake, it can be useful. It’s like an additional dialog with someone,” Sonia Contera adds.

They both emphasise that AI is a human tool, not a substitute, however. As Serge Belongie points out, “AI is at a stage of development in which it can assist in the creative process, by suggesting hypotheses or promising directions of solution, but it’s still the case that clever people are needed to orchestrate such systems […] the path to complete automation is still science fiction.”

Boosting the responsible use of generative AI in science

How, then, can we ensure that generative AI is used more responsibly in science? From talking to Serge Belongie and Sonia Contera, addressing flawed metrics of success in science and improving working conditions emerge as key areas.

Improving AI training is another. LLMs offer many opportunities – but only in the hands of AI-literate scientists. For scientific publishers and reviewers, clear transparency requirements on how AI was used to write and/or review an article look like the way forward, as opposed to outright AI bans.

At EU-level, policymakers are recognising the importance of developing generative AI to strengthen European values and ensure digital sovereignty. The European Commission’s recent AI continent action plan includes ambitions to set up at least 13 AI factories and five AI gigafactories across Europe to ensure researchers have access to sufficient computing power. This will be crucial in helping European scientists to reap the rewards of generative AI.

As Serge Belongie explains: “Training Large Models – for language, vision, or other modalities – is notoriously compute-hungry. This means that European research labs need low-friction access to supercomputing facilities. One solution is to lower barriers to supercomputing access, e.g., via the AI Factories Initiative.”

Sonia Contera recommends that the EU invest in new, more energy-efficient computing systems to address the energy demands of generative AI. She also points to the need for strengthened private-public partnerships on AI and increased funding for risky, innovative science.

Perhaps most importantly, Prof. Contera says “the EU needs to stay guided by democratic values. Policymakers should focus on what is good for actual citizens in Europe, not just for themselves or not for the companies that are lobbying.”

Related content
A scientist’s opinion: interview with Sonia Contera on generative AI in science
A scientist’s opinion: interview with Serge Belongie on generative AI in science
