AI company Perplexity is sneaking around blocks on crawlers, Cloudflare alleges

Artificial intelligence startup Perplexity is using stealthy techniques to get around network blocks against systematic browsing and scraping of web pages, Cloudflare said Monday in a blog post.
The alleged activity prompted Cloudflare, which received complaints from its customers, to take action against Perplexity.
“There are clear preferences that crawlers should be transparent, serve a clear purpose, perform a specific activity, and, most importantly, follow website directives and preferences,” Cloudflare engineers wrote. “Based on Perplexity’s observed behavior, which is incompatible with those preferences, we have de-listed them as a verified bot and added heuristics to our managed rules that block this stealth crawling.”
It’s the latest step in Cloudflare’s approach to crawling by AI systems, following last month’s announcement that customers can block web crawlers deployed to scrape their websites and data, or charge them fees.
Customers who disallowed Perplexity crawling activity in their robots.txt files (files that tell crawlers which parts of a website they can and cannot access) told Cloudflare that Perplexity was still able to access their content.
“These customers told us that Perplexity was still able to access their content even when they saw its bots successfully blocked,” Cloudflare said. “We confirmed that Perplexity’s crawlers were in fact being blocked on the specific pages in question, and then performed several targeted tests to confirm what exact behavior we could observe.”
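For readers unfamiliar with robots.txt, the check a well-behaved crawler is expected to make before fetching a page can be sketched roughly as follows. This is only an illustration using Python's standard library: the bot name "ExampleAIBot", the rules and the example.com URLs are hypothetical and do not come from Cloudflare's post or from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is barred from the whole site,
# and every other crawler is barred only from /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ExampleAIBot", "https://example.com/article"),
        ("OtherBot", "https://example.com/article"),
        ("OtherBot", "https://example.com/private/report"),
    ]:
        print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

The file is advisory: nothing in it stops a request by itself, so it only works when the crawler identifies itself honestly and runs a check like this one. That is why a crawler that hides its identity cannot be stopped by robots.txt alone.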
Emails to a Perplexity spokesperson and media email address seeking a response were not immediately answered. But spokesperson Jesse Dwyer told TechCrunch that Cloudflare’s blog post was no more than a “sales pitch,” that the screenshots in the post “show that no content was accessed” and the bot Cloudflare named “isn’t even ours.”
Perplexity has faced allegations of unethical web scraping in the past. Most recently, the BBC threatened to sue the company over content scraping, one of many legal challenges AI companies are facing, although some organizations have signed deals with AI firms, including Perplexity.
In its blog post, Cloudflare said OpenAI is an example of a company following recommended practices on crawlers and blocked behavior.