Four of the most popular AI chatbots routinely serve up inaccurate or misleading news content to users, according to a wide-reaching investigation.
A major study [PDF] led by the BBC on behalf of the European Broadcasting Union (EBU) found that OpenAI’s ChatGPT, Microsoft Copilot, Google Gemini, and Perplexity misrepresented news content in almost half of the cases.
An analysis of more than 3,000 responses from the AI assistants found that 45 percent of answers given contained at least one significant issue, 31 percent had serious sourcing problems, and a fifth had “major accuracy issues, including hallucinated details and outdated information.”
When accounting for smaller slip-ups, a whopping 81 percent of responses included a mistake of some sort.
Gemini was identified as the worst performer, with researchers identifying “significant issues” in 76 percent of responses it provided – double the error rate of the other AI bots.
The researchers attributed this to Gemini’s poor sourcing, finding significant sourcing problems in 72 percent of its responses – three times the rate of ChatGPT (24 percent), followed by Perplexity and Copilot (both 15 percent).
Across all the assistants studied, one in five responses contained accuracy errors, including outdated information.
Examples included ChatGPT incorrectly stating that Pope Francis was still pontificating weeks after his death, and Gemini confidently asserting that NASA astronauts had never been stranded in space – despite two crew members having spent nine months stuck on the International Space Station. Google’s AI bot told researchers: “You might be confusing this with a sci-fi movie or news that discussed a potential scenario where astronauts could get into trouble.”
The study, described as the largest of its kind, involved 22 public service media organizations from 18 countries.
The findings land not long after OpenAI admitted that its models are programmed to sound confident even when they’re not, conceding in a September paper that AI bots are rewarded for guessing rather than admitting ignorance – a design gremlin that encourages hallucination.
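To see why that incentive favors bluffing, here is a minimal back-of-the-envelope sketch – our own illustration, not code from OpenAI’s paper, with the question counts and probabilities invented for the example. Under a grader that awards a point for a correct answer and nothing for either a wrong answer or an “I don’t know,” guessing when unsure never scores worse than abstaining.

```python
import random

# Illustrative simulation (assumed numbers, not from the OpenAI paper):
# a binary grader gives 1 point for a correct answer and 0 for either a
# wrong answer or an abstention, so guessing can only help the score.

random.seed(0)

N_QUESTIONS = 100_000
P_KNOWN = 0.6         # assumed share of questions the model actually knows
P_LUCKY_GUESS = 0.25  # assumed chance a blind guess happens to be right

def average_score(policy: str) -> float:
    """Average benchmark score for a model that guesses or abstains when unsure."""
    total = 0
    for _ in range(N_QUESTIONS):
        if random.random() < P_KNOWN:
            total += 1  # known answer: always scored as correct
        elif policy == "guess":
            total += random.random() < P_LUCKY_GUESS  # unsure: roll the dice
        # abstaining earns nothing on unknown questions, same as a wrong guess
    return total / N_QUESTIONS

print(f"always guess : {average_score('guess'):.3f}")    # ~0.70
print(f"admit unsure : {average_score('abstain'):.3f}")  # ~0.60
```

Only a scoring scheme that penalizes a confident wrong answer more heavily than an admission of ignorance flips that calculation.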
Hallucinations can show up in embarrassing ways. In May, lawyers representing Anthropic were forced to apologize to a US court after submitting filings containing citations fabricated by the company’s own Claude model. The debacle happened because the team failed to double-check Claude’s contributions before handing in their work.
All the while, consumer use of AI chatbots is on the up. An accompanying Ipsos survey [PDF] of 2,000 UK adults found 42 percent trust AI to deliver accurate news summaries, rising to half among under-35s. However, 84 percent said a factual error would significantly damage their trust in an AI summary, demonstrating the risks media outlets face from ill-trained algorithms.
The report was accompanied by a toolkit [PDF] designed to help developers and media organizations improve how chatbots handle news information and stop them bluffing when they don’t know the answer.
“This research conclusively shows that these failings are not isolated incidents,” said Jean Philip De Tender, EBU deputy director general. “When people don’t know what to trust, they end up trusting nothing at all, and that can deter democratic participation.” ®
