Wikipedia is struggling with voracious AI bot crawlers

Wikimedia has seen a 50 percent increase in bandwidth used for downloading multimedia content since January 2024, the foundation said in an update. But it's not because human readers have suddenly developed a voracious appetite for consuming Wikipedia articles and for watching videos or downloading files from Wikimedia Commons. No, the spike in usage came from AI crawlers, or automated programs scraping Wikimedia's openly licensed images, videos, articles and other files to train generative artificial intelligence models. 
This sudden increase in traffic from bots could slow down access to Wikimedia's pages and assets, especially during high-interest events. When Jimmy Carter died in December, for instance, people's heightened interest in the video of his presidential debate with Ronald Reagan caused slow page load times for some users. Wikimedia is equipped to sustain traffic spikes from human readers during such events, and users watching Carter's video shouldn't have caused any issues. But "the amount of traffic generated by scraper bots is unprecedented and presents growing risks and costs," Wikimedia said.
The foundation explained that human readers tend to look up specific and often similar topics. For instance, a number of people look up the same thing when it's trending. Wikimedia creates a cache of a piece of content requested multiple times in the data center closest to the user, enabling it to serve up content faster. But articles and content that haven't been accessed in a while have to be served from the core data center, which consumes more resources and, hence, costs more money for Wikimedia. Since AI crawlers tend to bulk read pages, they access obscure pages that have to be served from the core data center. 
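The caching effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the LRU eviction policy, the cache size, the page counts and the traffic mix are all hypothetical and are not taken from Wikimedia's actual infrastructure. It compares how often a regional cache misses (and so has to fall back to the core data center) for reader-like traffic concentrated on a few trending pages versus a crawler that reads every page in bulk.

```python
from collections import OrderedDict
import random


class EdgeCache:
    """Tiny LRU cache standing in for a regional data center (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def request(self, page):
        if page in self.items:
            # Cache hit: served locally, mark the page as recently used.
            self.items.move_to_end(page)
            self.hits += 1
        else:
            # Cache miss: in this model the page comes from the core data center.
            self.misses += 1
            self.items[page] = True
            if len(self.items) > self.capacity:
                self.items.popitem(last=False)  # evict the least recently used page


def miss_rate(requests, capacity):
    """Fraction of requests that could not be served from the cache."""
    cache = EdgeCache(capacity)
    for page in requests:
        cache.request(page)
    return cache.misses / len(requests)


rng = random.Random(0)   # fixed seed so the run is repeatable
N_PAGES = 10_000         # assumed total number of pages
CACHE_SIZE = 500         # assumed cache capacity

# Reader-like traffic (assumption): 90% of requests hit 100 trending pages,
# the rest are spread over the whole site.
human = [
    rng.randrange(100) if rng.random() < 0.9 else rng.randrange(N_PAGES)
    for _ in range(20_000)
]

# Crawler-like traffic (assumption): every page read in order, two full passes.
crawler = list(range(N_PAGES)) * 2

human_miss = miss_rate(human, CACHE_SIZE)
crawler_miss = miss_rate(crawler, CACHE_SIZE)

print(f"reader-like miss rate:  {human_miss:.2f}")
print(f"crawler-like miss rate: {crawler_miss:.2f}")
```

In this toy model the bulk scan never finds a page in the cache, because each page has been evicted long before the crawler returns to it, while the reader-like traffic is served mostly from the cache. Real traffic and real cache policies are more complicated, so the numbers only illustrate the direction of the effect.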
Wikimedia said that upon a closer look, 65 percent of the resource-consuming traffic it gets is from bots. This is already causing constant disruption for its Site Reliability team, which has to keep blocking the crawlers before they significantly slow down page access for actual readers. The real problem, as Wikimedia states, is that the "expansion happened largely without sufficient attribution, which is key to drive new users to participate in the movement." A foundation that relies on people's donations to continue running needs to attract new users and get them to care for its cause. "Our content is free, our infrastructure is not," the foundation said. Wikimedia is now looking to establish sustainable ways for developers and reusers to access its content in the upcoming fiscal year. It has to, because it sees no sign of AI-related traffic slowing down anytime soon.
This article originally appeared on Engadget at https://www.engadget.com/ai/wikipedia-is-struggling-with-voracious-ai-bot-crawlers-121546854.html?src=rss
