Some people are defending Perplexity after Cloudflare named and shamed it


By [email protected]


When Cloudflare accused the AI search engine Perplexity on Monday of stealthily scraping websites, ignoring the sites' specified methods of blocking it, this was not a clear-cut case of an AI company behaving badly on the web.

Many people came to Perplexity's defense. They argued that Perplexity accessing sites against the wishes of the site owner, while controversial, is acceptable. It is a debate that will certainly grow as AI agents flood the internet: should an agent visiting a website on behalf of its user be treated like a bot? Or like the human making the same request?

Cloudflare is famous for providing anti-bot protection and other web security services to millions of websites. Essentially, Cloudflare's test case involved setting up a new website on a brand-new domain that had never been crawled by any bot, configuring a robots.txt file that specifically blocked Perplexity's known AI crawlers, and then asking Perplexity about the site's content. Perplexity answered the question.
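For illustration, here is a minimal Python sketch of the robots.txt check a well-behaved crawler performs before fetching a page. The rules, domain, and bot names below are hypothetical stand-ins for the kind of file Cloudflare describes, not the exact configuration used in its test.

```python
# Minimal sketch of a robots.txt check, using only the Python standard library.
# The rules, domain, and bot names are hypothetical stand-ins for the kind of
# file Cloudflare describes: a fresh domain that disallows known AI crawlers.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant crawler checks this before requesting a page and walks away on False.
print(parser.can_fetch("PerplexityBot", "https://secret-example.test/page"))  # False
print(parser.can_fetch("SomeOtherBot", "https://secret-example.test/page"))   # True
```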

Cloudflare researchers found that the AI search engine used "a generic browser intended to impersonate Google Chrome on macOS" when its declared web crawler was blocked. Cloudflare CEO Matthew Prince posted the research on X, writing, "Some supposedly 'reputable' AI companies act more like North Korean hackers. Time to name, shame, and hard block them."
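To make the distinction concrete, the sketch below contrasts a request that identifies itself as a crawler with one dressed up as an ordinary Chrome-on-macOS browser. The header values are illustrative examples, not the exact strings observed in Cloudflare's research.

```python
# Illustrative sketch of the difference between a self-identifying crawler request
# and the kind of disguised request Cloudflare describes. All header values here
# are examples, not the exact strings from Cloudflare's research.

# A declared crawler announces itself, so robots.txt rules and bot filters can apply to it.
declared_request_headers = {
    "User-Agent": "PerplexityBot/1.0",
}

# A stealth crawler instead presents a generic desktop-browser identity, so the
# request looks like an ordinary Chrome-on-macOS visitor rather than a bot.
spoofed_request_headers = {
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    ),
}

for label, headers in (("declared", declared_request_headers),
                       ("spoofed", spoofed_request_headers)):
    print(f"{label}: {headers['User-Agent']}")
```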

But many people did not agree with Prince's assessment that this was actually bad behavior. Those defending Perplexity on sites like X and Hacker News pointed out that what Cloudflare seemed to be documenting was Perplexity accessing a specific public website when its user asked about that specific site.

"If I request a website as a human, then I should be shown the content," someone on Hacker News wrote, adding, "Why would Perplexity accessing the website on my behalf be in a different legal category than my Firefox web browser doing so?"

A Perplexity spokesperson had previously denied to TechCrunch that the bots were the company's and called Cloudflare's blog post a sales pitch for Cloudflare. Then on Tuesday, Perplexity published a blog post in its own defense (and largely attacking Cloudflare), claiming the behavior came from a third-party service it uses occasionally.


But the heart of Perplexity's post made an appeal similar to the one its defenders were making online.

"The difference between automated crawling and user-driven fetching isn't just technical, it's about who gets to access information on the open web," the post said. "This controversy reveals that Cloudflare's systems are fundamentally insufficient to distinguish between legitimate AI assistants and actual threats."

Perplexity's accusations aren't entirely fair, either. One of the arguments Prince and Cloudflare used to call out Perplexity was that OpenAI doesn't behave the same way.

"OpenAI is an example of a leading AI company that follows best practices. It respects robots.txt and does not try to evade either a robots.txt directive or a network-level block. And ChatGPT Agent signs HTTP requests using the newly proposed open standard Web Bot Auth," the post stated.

Web Bot Auth is a Cloudflare-backed standard being developed through the Internet Engineering Task Force that hopes to establish a cryptographic method for identifying AI agent traffic.
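In rough terms, the draft builds on HTTP message signatures: an agent signs its requests with a private key and publishes the matching public key so sites can verify who is really crawling. The Python sketch below, using Ed25519 from the `cryptography` package, is a heavily simplified illustration of that concept with made-up header names and a made-up site, not a compliant implementation of the draft standard.

```python
# Heavily simplified sketch of the idea behind Web Bot Auth / HTTP message
# signatures: the agent signs request metadata with a private key, and the
# receiving site verifies the signature against the agent's published public key.
# This illustrates the concept only; it is not a spec-compliant implementation.
import base64
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The agent operator generates a key pair and publishes the public key
# (in the real standard, via a key directory that sites can fetch).
agent_private_key = Ed25519PrivateKey.generate()
agent_public_key = agent_private_key.public_key()

# Simplified "signature base": the request details the signature covers.
# (The real standard defines exactly which components are covered and how.)
signature_base = (
    '"@method": GET\n'
    '"@authority": secret-example.test\n'          # hypothetical site
    '"signature-agent": https://agent.example\n'   # hypothetical agent identity URL
)

signature = agent_private_key.sign(signature_base.encode())
print("Signature:", base64.b64encode(signature).decode()[:32], "...")

# The website (or its CDN) recomputes the signature base from the request it
# received and verifies it; a forged or unsigned request fails this check.
agent_public_key.verify(signature, signature_base.encode())  # raises if invalid
print("Signature verified: request really came from the declared agent.")
```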

The debate comes at a time when bot activity is reshaping the internet. As TechCrunch has previously reported, bots that scrape enormous amounts of content to train AI models have become a threat, especially to smaller sites.

For the first time in the internet's history, bot activity is currently outpacing human activity online, with AI-driven traffic accounting for more than 50%, according to Imperva's report released last month. Most of that activity comes from LLMs. But the report also found that malicious bots now make up 37% of all internet traffic. That activity includes everything from persistent scraping to unauthorized login attempts.

Until LLMs, the web had generally accepted that websites could block most bot activity, given how often it was malicious, using CAPTCHAs and other services (such as Cloudflare's). Websites also had a clear incentive to work with specific good actors, such as Googlebot, directing it away from what shouldn't be indexed via robots.txt. Googlebot indexed the internet, which in turn sent traffic to the sites.

Now, LLMs are taking a growing share of that traffic. Gartner predicts that search engine volume will drop 25% by 2026. Humans now tend to click through to a website's links from LLMs at the point most valuable to the site: when they are ready to perform a transaction.

But if humans adopt agents the way the tech industry expects they will (booking our travel, making our dinner reservations, shopping for us), would websites be hurting their own business interests by blocking them? The debate on X captured the dilemma perfectly:

"I WANT Perplexity to visit any public content on my behalf when I give it a request/task!" one person wrote in response to Cloudflare calling Perplexity out.

"What if the site owners don't want that? They want you to visit their site directly, see their stuff," another argued, noting that the site owner who created the content wants the traffic and potential ad revenue, not to let Perplexity take it.

"This is why I can't see 'agentic browsing' really working. It's a much harder problem than people think. Most site owners will just block it," a third person chimed in.


