AI summary 'trashed author's work' and took weeks to be corrected

<ÍøÆØÃÅ class="standfirst">Study findings misrepresented in experimental Q&A published with paper, amid concerns efforts to save researchers time are fuelling mistakes
April 24, 2025
The crew of the HMS Narvik watch the smoke rise after a British atomic test, which took place on the Montebello Islands off the west coast of Australia, 16 May 1956.
Source: Central Press/Getty Images

When doctoral student Madison Williams-Hoffman notched her first lead-author article in a peer-reviewed journal last September, it was a milestone in the young chemist's career. The paper, published in the Journal of Environmental Radioactivity, a title of the world's biggest and most profitable academic publisher, investigated the ecological fallout from Britain's 1950s atomic bomb tests in Western Australia's Montebello Islands.

Williams-Hoffman's team collected 11 seabed samples from three locations in the remote archipelago of sandy tropical cays. The target material ranged from radioactive specks, less than a tenth of a millimetre in diameter, to leftover grains from the bombs and detonation infrastructure. The scientists wanted to find out how mobile these particles were and the extent to which they could be ingested by the marine life, which ranges from seagrasses, mangroves, corals and sponges to lobsters, shorebirds, turtles, dugongs, sharks and whales.

Ten of the samples were separated into 50 smaller batches for spectrometric analysis of plutonium isotopes. The 11th sample – from Claret Bay, well south of the test sites – was also analysed as an environmental control. The research confirmed the persistence of radioactive particles, highlighting the need for further research into the region's "nuclear legacy".

In March, Williams-Hoffman was surprised to discover that the online version of the paper contained an AI-generated question and answer section immediately below the abstract. She was even more surprised to read its claim that the paper was based on just three measurements, not 51. The AI had apparently confused the methodology of Williams-Hoffman's study with earlier research she had cited.

This was a serious error, particularly as it came while the relatively new paper had its best prospects of being read, cited and acted on. Three measurements would be nowhere near adequate to support the study's conclusions, she explained.

"If…a paper in my field only analysed three samples, I would toss it," she said. "Researchers can spend years of time and taxpayer money to come to meaningful scientific conclusions [which are] trashed, essentially, by AI-generated content."

Williams-Hoffman said she did not know how long the incorrect information had been online, because she had not been notified about the Q&As. It took several weeks and many communications to have the information corrected, before the entire Q&A section was deleted – again, without Williams-Hoffman's knowledge.

Times Higher Education asked the publisher, Elsevier, why the Q&As had not been checked for accuracy and why the corresponding author had not been informed of their inclusion or given any ready means of correcting them.

A spokeswoman said Elsevier ran regular tests of generative AI, "with human oversight", to gauge its potential to support researchers. The Q&As had resulted from one such trial and had been removed "once the short-lived test was complete".

She said the Q&As had been clearly marked as AI-generated content, "while inviting users to rate the quality of the response and learn more. Users were also provided with links to the sources, encouraging them to verify the AI content themselves."

Elsevier recently introduced AI tools that generate summaries of content drawn from its Scopus abstract and citation database and its ScienceDirect platform of full-text articles and book chapters. The publisher says such innovations could halve the time researchers spend on literature searches.

But concerns have been raised about the accuracy of AI summaries. A recent study found that AI apps, particularly the newest models, tended to exaggerate scientific findings.

A colleague of Williams-Hoffman's, who researches the ecological impacts of bushfires, flagged similar mistakes in an AI summary of one of his papers, but the editorial department of the journal that published the work never responded. "For the entire time they were live, they were incorrect," he said.

"I certainly think this is a grey zone of intellectual property. There appears to be an auto-include to specific journals and no way to correct the misinformation. It also begs the question why we write highlights and abstracts if they're going to AI-summarise the paper anyway."

john.ross@timeshighereducation.com

<ÍøÆØÃÅ class="pane-title"> Related articles
<ÍøÆØÃÅ class="pane-title"> Reader's comments (2)
Where were the journal's editors? The publisher's representatives? All texts must be checked, AI or human or, or, or…
It is clear that Elsevier's response was from its AI. No other explanation. Do any humans work for Elsevier in 2025?
<ÍøÆØÃÅ class="pane-title"> Sponsored
<ÍøÆØÃÅ class="pane-title"> Featured jobs
See all jobs
ADVERTISEMENT