
Perplexity AI Caught Using Stealth Tactics to Bypass Website Restrictions



Perplexity AI, a known player in the AI-powered search and answer engine space, is under fire for reportedly bypassing website restrictions through stealth crawling. A new report reveals that the company has been accessing data from websites that had explicitly blocked it, using undeclared user agents and IP address obfuscation techniques.

Perplexity, like other AI tools, depends on data from across the web to power its answers, so it runs bots that crawl and scrape websites. But instead of following the standard rules that govern bots and crawlers, it reportedly took an unethical route.

Most websites use a file called robots.txt to tell bots what they are allowed to access. This system works on a basic rule of trust. Bots declare who they are and respect those instructions. But in this case, Perplexity seems to be sidestepping those boundaries.
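This trust-based mechanism can be checked programmatically. A minimal sketch using Python's standard `urllib.robotparser` shows how a well-behaved crawler consults robots.txt before fetching; the Disallow rules below are illustrative, not any real site's file:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt blocking Perplexity's declared crawlers
# while allowing everyone else
robots_txt = """
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A compliant bot checks its own declared user agent before fetching
print(parser.can_fetch("PerplexityBot", "https://example.com/article"))  # False
print(parser.can_fetch("SomeOtherBot", "https://example.com/article"))   # True
```

The whole system works only if the crawler honestly supplies its own name to the check, which is exactly the step the stealth tactics described below skip.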

When websites block their known bots, like PerplexityBot and Perplexity-User, the company reportedly switches to a more covert mode. It uses generic browser signatures, such as a fake Google Chrome user-agent string, and rotates IP addresses from different networks to get around these blocks. These stealth crawlers also ignore the robots.txt file entirely in many cases.
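One heuristic site operators use to catch this kind of spoofing is to compare a request's claimed identity against a crawler's published IP ranges. The sketch below is a simplified illustration of that idea, not Cloudflare's or any vendor's actual detection logic; the IP range and function name are hypothetical:

```python
import ipaddress

# Hypothetical published IP range for a declared crawler (illustrative value)
DECLARED_BOT_RANGES = [ipaddress.ip_network("192.0.2.0/24")]

def looks_like_stealth_crawl(user_agent: str, client_ip: str) -> bool:
    """Flag two mismatch patterns: a bot user agent arriving from outside
    its published ranges, or a generic browser user agent arriving from
    an address inside the bot's published ranges."""
    ip = ipaddress.ip_address(client_ip)
    in_bot_range = any(ip in net for net in DECLARED_BOT_RANGES)
    claims_bot = "PerplexityBot" in user_agent
    return (claims_bot and not in_bot_range) or (not claims_bot and in_bot_range)

# A "Chrome" request from an address known to belong to the crawler is suspicious
print(looks_like_stealth_crawl("Mozilla/5.0 (Windows NT 10.0) Chrome/126.0", "192.0.2.10"))  # True
# A declared bot request from its own range is consistent
print(looks_like_stealth_crawl("PerplexityBot/1.0", "192.0.2.10"))  # False
```

Real deployments combine signals like these with behavioral fingerprinting, since rotating IPs across many networks, as the report describes, defeats a simple range check.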

The tactics were uncovered through controlled tests. New websites were set up with strict no-crawling rules, and yet Perplexity was still able to access their content and serve it in response to user queries. Notably, the stealth behavior appeared only after its official bots were blocked. When AI tools like Perplexity break the trust that robots.txt depends on, they undermine the system for every site and every crawler.

Perplexity’s methods raise several red flags. By not identifying itself honestly, the company breaks one of the most basic principles of responsible crawling. Using stealth tactics to extract data from websites that explicitly disallowed access is a form of digital trespassing.

What makes this case even more concerning is how it compares to how other companies operate. OpenAI, for example, follows well-defined crawling practices. Its bots declare themselves properly, respect robots.txt files, and stop crawling when blocked. When tested under similar conditions, OpenAI’s ChatGPT agent behaved exactly as expected: it stopped when told to.

As AI tools become more powerful, the question of content access becomes more critical. Website owners deserve to decide how their data is used, especially when it comes to AI training and answer generation. Many platforms, especially those using services like Cloudflare, are now actively blocking AI bots that do not play by the rules.

Over 2.5 million websites have already opted out of AI training by updating their robots.txt files or using automated blocking tools. But if companies like Perplexity continue to use stealth methods, the arms race between crawlers and defenders will only escalate.

SOURCE: Cloudflare




About the Author: Deepanker Verma

Deepanker Verma is the Founder and Editor-in-Chief of TechloMedia. He holds an Engineering degree in Computer Science and has over 15 years of experience in the technology sector. Deepanker bridges the gap between complex engineering and consumer electronics. He is also a known Security Researcher acknowledged by global giants including Apple, Microsoft, and eBay. He uses his technical background to rigorously test gadgets, focusing on performance, security, and long-term value.
