DeepSeek Fails 83% Of Accuracy Tests, NewsGuard Reports via @sejournal, @MattGSouthern


DeepSeek, the Chinese AI chatbot topping App Store downloads, failed 83% of accuracy tests and often promotes government positions.

  • DeepSeek scored poorly in accuracy, failing 83% of tests and ranking 10th out of 11 AI chatbots.
  • The chatbot often inserts Chinese government messaging into unrelated responses.
  • Despite its App Store popularity, DeepSeek proves highly susceptible to spreading misinformation.

DeepSeek, the Chinese AI chatbot topping App Store downloads, has scored poorly in NewsGuard’s latest accuracy assessment.

According to NewsGuard’s audit:

“[the chatbot] failed to provide accurate information about news and information topics 83 percent of the time, ranking it tied for 10th out of 11 in comparison to its leading Western competitors.”

Key Findings:

  • 30% of responses contained false information
  • 53% of responses provided non-answers to queries
  • Only 17% of responses debunked false claims
  • Performed significantly worse than the industry average fail rate of 62%

Chinese Government Positioning

DeepSeek’s responses show a notable pattern. The chatbot often inserts Chinese government positions into answers, even when the questions are unrelated to China.

For example, when asked about a situation in Syria, DeepSeek responded:

“China has always adhered to the principle of non-interference in the internal affairs of other countries, believing that the Syrian people have the wisdom and capability to handle their own affairs.”

Technical Limitations

Despite DeepSeek’s claims of matching OpenAI’s capabilities with just $5.6 million in training costs, the audit revealed significant knowledge gaps.

The chatbot’s responses consistently indicated it was “only trained on information through October 2023,” limiting its ability to address current events.

Misinformation Vulnerability

NewsGuard found that:

“DeepSeek was most susceptible to repeating false claims when responding to malign actor prompts of the kind used by people seeking to use AI models to make and spread false claims.”

Of particular concern:

“Of the 9 DeepSeek responses that contained false information, 8 were in response to malign actor prompts, demonstrating how DeepSeek and other tools like it can easily be weaponized by bad actors to spread misinformation at scale.”

Industry Context

The assessment comes at a critical time in the AI race between China and the United States.

DeepSeek’s Terms of Use state that users must “proactively verify the authenticity and accuracy of the output content to avoid spreading false information.”

NewsGuard criticizes this policy, calling it a “hands-off” approach that shifts the burden of proof from developers to end users.

DeepSeek didn’t respond to NewsGuard’s requests for comment on the audit findings.

From now on, DeepSeek will be included in NewsGuard’s monthly AI audits. Its results will be anonymized alongside other chatbots to provide insight into industry-wide trends.

What This Means

While DeepSeek is attracting attention in the marketing world, its high fail rate shows it isn’t reliable.

Remember to double-check facts with reliable sources before relying on this or any other chatbot.



SEJ STAFF Matt G. Southern Senior News Writer at Search Engine Journal

Matt G. Southern, Senior News Writer, has been with Search Engine Journal since 2013. With a bachelor’s degree in communications, ...