Discover ANY AI to make more online for less.

select between over 22,900 AI Tool and 17,900 AI News Posts.


Researchers find just 250 malicious documents can leave LLMs vulnerable to backdoors
Researchers find just 250 malicious documents can leave LLMs vulnerable to backdoors

Artificial intelligence companies have been working at breakneck speeds to develop the best and most powerful tools, but that rapid development hasn't always been coupled with clear understandings of AI's limitations or weaknesses. Today, Anthropic released a report on how attackers can influence the development of a large language model.The study centered on a type of attack called poisoning, where an LLM is pretrained on malicious content intended to make it learn dangerous or unwanted behaviors. The key finding from this study is that a bad actor doesn't need to control a percentage of the pretraining materials to get the LLM to be poisoned. Instead, the researchers found that a small and fairly constant number of malicious documents can poison an LLM, regardless of the size of the model or its training materials. The study was able to successfully backdoor LLMs based on using only 250 malicious documents in the pretraining data set, a much smaller number than expected for models ranging from 600 million to 13 billion parameters. "We’re sharing these findings to show that data-poisoning attacks might be more practical than believed, and to encourage further research on data poisoning and potential defenses against it," the company said. Anthropic collaborated with the UK AI Security Institute and the Alan Turing Institute on the research.This article originally appeared on Engadget at https://www.engadget.com/researchers-find-just-250-malicious-documents-can-leave-llms-vulnerable-to-backdoors-191112960.html?src=rss

Rating

Innovation

Pricing

Technology

Usability

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

Engadget Podcast: iPhone 16e review and Amazon's AI-powered Alexa+
Engadget Podcast: iPhone 16e review and Amazon's AI-powered Alexa+

<p>The keyword for the <a data-i13n="cpos:1;pos:1" href="https://www.engadget.com/mobile/smartphones/iphone-16e-review-whats-your-acceptable-compromise-020016288.html"> [...]

Match Score: 85.05

The Morning After: Our verdict on the Pixel 10 Pro Fold
The Morning After: Our verdict on the Pixel 10 Pro Fold

<p>A little after the launch of the rest of the Pixel 10 family, Google’s new foldable is here. The <a data-i13n="cpos:1;pos:1" href="https://www.engadget.com/mobile/smartpho [...]

Match Score: 82.58

venturebeat
MCP stacks have a 92% exploit probability: How 10 plugins became enterprise

<p>The same connectivity that made <a href="https://www.anthropic.com/news/model-context-protocol">Anthropic&#x27;s Model Context Protocol (MCP)</a> the fastest-adopted [...]

Match Score: 69.38

Researchers secretly experimented on Reddit users with AI-generated comments
Researchers secretly experimented on Reddit users with AI-generated comment

<p>A group of researchers covertly ran a months-long "unauthorized" experiment in one of Reddit’s most popular communities using AI-generated comments to test the persuasiveness of l [...]

Match Score: 60.22

venturebeat
ACE prevents context collapse with ‘evolving playbooks’ for self-improv

<p>A new framework from <a href="https://www.stanford.edu/"><u>Stanford University</u></a> and <a href="https://sambanova.ai/"><u>SambaNov [...]

Match Score: 49.22

venturebeat
Meta’s new CWM model learns how code works, not just what it looks like

<p><a href="https://www.meta.com/">Meta</a>’s AI research team has released a new large language model (LLM) for coding that enhances code understanding by learning not o [...]

Match Score: 47.30

venturebeat
New AI training method creates powerful software agents with just 78 exampl

<p>A new study by <a href="https://en.sjtu.edu.cn/"><u>Shanghai Jiao Tong University</u></a> and <a href="https://plms.ai/"><u>SII Generat [...]

Match Score: 45.88

venturebeat
Self-improving language models are becoming reality with MIT's updated SEAL

<p>Researchers at the Massachusetts Institute of Technology (MIT) are gaining renewed attention for developing and <a href="https://github.com/Continual-Intelligence/SEAL/blob/main/LICEN [...]

Match Score: 43.83

Research Suggests LLMs Willing to Assist in Malicious ‘Vibe Coding’
Research Suggests LLMs Willing to Assist in Malicious ‘Vibe Coding’

<img width="250" height="143" src="https://www.unite.ai/wp-content/uploads/2025/05/anachronism-MAIN-1-250x143.jpg" class="webfeedsFeaturedVisual wp-post-image&quo [...]

Match Score: 43.07