Perplexity accused of scraping websites that explicitly blocked AI scraping

By admin | August 4, 2025 | Updated: August 5, 2025 | 3 Mins Read

AI startup Perplexity is crawling and scraping content from websites that have explicitly indicated they don’t want to be scraped, according to internet infrastructure provider Cloudflare.

On Monday, Cloudflare published research saying it observed the AI startup ignoring blocks and hiding its crawling and scraping activities. Cloudflare’s researchers wrote that Perplexity obscured its identity when trying to scrape web pages “in an attempt to circumvent the website’s preferences.”

AI products like those offered by Perplexity rely on gobbling up large amounts of data from the internet, and AI startups have long scraped text, images, and videos from the web, often without permission, to make their products work. More recently, websites have tried to fight back using robots.txt, a standard file that tells search engines and AI companies which pages may be crawled and which may not. Those efforts have seen mixed results so far.
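To make the robots.txt mechanism concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the crawler name `PerplexityBot`, and the `example.com` URL are illustrative assumptions, not taken from Cloudflare's post. It also shows that the file is advisory: it only has an effect if the crawler chooses to check it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: disallow one named AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    # A crawler that identifies itself honestly is told to stay out.
    print(is_allowed("PerplexityBot", url))
    # The same rules do not match a client presenting a generic browser identity.
    print(is_allowed("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)", url))
```

Nothing in this check is enforced by the server. A well-behaved crawler runs a test like this before each fetch, while a crawler that skips it or changes its declared name is not stopped by robots.txt alone.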

Perplexity appears to be deliberately circumventing these blocks by changing its bots’ “user agent,” a string that identifies a website visitor’s software, such as the browser or bot name, its version, and the device platform. It is also changing the autonomous system number, or ASN, that its traffic comes from, according to Cloudflare. An ASN is essentially a number that identifies a large network on the internet.

“This activity was observed across tens of thousands of domains and millions of requests per day. We were able to fingerprint this crawler using a combination of machine learning and network signals,” read Cloudflare’s post.

Perplexity spokesperson Jesse Dwyer dismissed Cloudflare’s blog post as a “sales pitch,” adding in an email to TechCrunch that the screenshots in the post “show that no content was accessed.” In a follow-up email, Dwyer claimed the bot named in the Cloudflare blog “isn’t even ours.”

Cloudflare said it first noticed the behavior after its customers complained that Perplexity was crawling and scraping their sites even after they had added rules to their robots.txt files and rules specifically blocking Perplexity’s known bots. Cloudflare said it then ran its own tests and confirmed that Perplexity was circumventing these blocks.

“We observed that Perplexity uses not only their declared user-agent, but also a generic browser intended to impersonate Google Chrome on macOS when their declared crawler was blocked,” according to Cloudflare.
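The following sketch shows why a block keyed on the declared user agent stops working once a crawler presents a generic browser string instead. The blocklist token, both user-agent strings, and the `is_blocked` helper are hypothetical illustrations, not Cloudflare's actual rules or Perplexity's actual headers.

```python
# Hypothetical blocklist of crawler name tokens, matched case-insensitively.
BLOCKED_TOKENS = ("perplexitybot",)

# Illustrative strings: a self-identifying crawler and a typical-looking
# Chrome-on-macOS browser string.
DECLARED_UA = "Mozilla/5.0 (compatible; PerplexityBot/1.0)"
GENERIC_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_blocked(user_agent: str, blocked_tokens=BLOCKED_TOKENS) -> bool:
    """Return True if the user-agent string contains any blocked token."""
    ua = user_agent.lower()
    return any(token in ua for token in blocked_tokens)


if __name__ == "__main__":
    print(is_blocked(DECLARED_UA))  # the declared crawler matches the rule
    print(is_blocked(GENERIC_UA))   # the browser-like string does not
```

Because the user-agent header is supplied by the client, a rule like this depends on the client's honesty, which is consistent with Cloudflare saying it relied on machine learning and network signals to fingerprint the crawler.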

The company also said that it has de-listed Perplexity’s bots from its verified list and added new techniques to block them.

Cloudflare has recently taken a public stance against AI crawlers. Last month, it announced the launch of a marketplace that lets website owners and publishers charge AI scrapers that visit their sites. Cloudflare’s chief executive Matthew Prince sounded the alarm at the time, saying AI is breaking the business model of the internet, particularly for publishers. Last year, Cloudflare also launched a free tool to prevent bots from scraping websites to train AI.

This is not the first time Perplexity has been accused of scraping without authorization.

Last year, news outlets such as Wired alleged that Perplexity was plagiarizing their content. Weeks later, Perplexity’s CEO Aravind Srinivas was unable to immediately answer when asked to provide the company’s definition of plagiarism during an interview with TechCrunch’s Devin Coldewey at the Disrupt 2024 conference.
