The war for the web has begun

A high-stakes war has erupted over the future of the internet. In one corner is Cloudflare, an online infrastructure giant that serves as a gatekeeper for a large share of the web’s traffic. In the other is Perplexity, an AI darling and search engine that threatens to upend Google’s dominance.

The accusation is explosive: Cloudflare alleges that Perplexity is a bad actor, a rogue bot that ignores the internet’s oldest rules to secretly scrape data from websites that have explicitly told it to stay away. Perplexity has fired back in full force, claiming that Cloudflare is either dangerously incompetent or engaged in a publicity stunt, and that it fundamentally misunderstands how modern AI works.

The dispute is the first major battle in a conflict that will define the next era of the web: who gets to access information on the internet, and who gets to make the rules?

The accusation: a rogue bot in disguise

For decades, the internet has operated on a “gentleman’s agreement” called the robots.txt file. It is a simple text file that website owners use to post a digital “No Trespassing” sign for web crawlers, or “bots.” Reputable bots, such as Google’s, respect this sign.
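To make the “sign” concrete, here is a minimal sketch of how a well-behaved bot might consult robots.txt before fetching a page, using Python’s standard `urllib.robotparser` module. The rules, the `example.com` URLs, and the `is_allowed` helper are hypothetical illustrations, not the contents of any real site’s file; only the bot name “PerplexityBot” comes from the reporting. Nothing in this check is enforced by the server, which is why the arrangement is an agreement rather than a lock.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is barred from the whole site,
# every other bot is barred only from /private/.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite crawler runs this check itself before every request.
    for agent in ("PerplexityBot", "Googlebot"):
        for path in ("/article", "/private/notes"):
            url = "https://example.com" + path
            print(agent, path, is_allowed(ROBOTS_TXT, agent, url))
```

The check runs entirely on the crawler’s side, and the answer depends on the name the crawler declares for itself. A client that announces a different identity is evaluated against different rules.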

In a scathing blog post, Cloudflare claims that Perplexity is ignoring it. The company alleges that when its declared bot, “PerplexityBot,” is blocked, the AI search engine switches to a stealth mode, using generic browser identities and rotating IP addresses to keep crawling and scraping data in disguise.

Cloudflare says it tested this by creating brand-new, private websites with a strict “no bots allowed” sign. Nevertheless, it found that “Perplexity was still providing detailed information regarding the exact content hosted on each of these restricted domains.” Based on this “stealth crawling behavior,” Cloudflare announced that it is de-listing Perplexity as a verified bot and actively blocking its undeclared crawling.

The rebuttal: “You don’t understand how AI works”

Perplexity’s response was swift, accusing Cloudflare of getting “almost everything wrong about how modern AI assistants actually work.” The company argues that it is not a traditional “bot” and that Cloudflare is misapplying old rules to new technology.

The core of its argument is the difference between a bot and a user agent. A traditional bot, such as Google’s, systematically crawls billions of pages to build a massive index for later use. A user agent, Perplexity claims, acts on behalf of a real person in real time. When you ask a question, the AI agent fetches the necessary information from the web at that moment to answer you. It is not storing data; it is acting as a personal researcher.

“This is fundamentally different from traditional web crawling, in which crawlers systematically visit millions of pages to build massive databases, whether or not anyone asked for that specific information,” Perplexity wrote in a detailed response. “When companies like Cloudflare mischaracterize user-driven AI assistants as malicious bots, they’re arguing that any automated tool serving users should be suspect, a position that would criminalize email clients and web browsers.”

Then came the counterpunch. Perplexity claims that Cloudflare “fundamentally misattributed 3-6M daily requests” from a third-party cloud browser service to Perplexity, calling it “a basic traffic analysis failure that’s particularly embarrassing for a company whose core business is understanding and categorizing web traffic.” Perplexity suggests that this is either a “clever publicity moment” or a sign that Cloudflare is “dangerously misinformed on the basics of AI.”

Users are divided on social media. “Perplexity is just using an agent to retrieve something that’s already on the public web, to answer a user’s question. Framing it as some kind of attack is ridiculous. The public internet should be public,” said one user. Another was more critical: “Perplexity pretends to be a search engine, pretends to be AI, but isn’t.”

Perplexity is just using an agent to retrieve something that’s already on the public web, to answer a user’s question.

Framing it as some kind of attack is ridiculous. The public internet should be public.

Andrej (@0xdrej) August 4, 2025

Who owns the open web?

This public spat exposes the central tension of the AI era. AI startups such as Perplexity need access to the vast trove of data on the open web in order to function and to compete with giants such as Google and OpenAI. Without it, they cannot provide accurate, up-to-date answers. But website owners are increasingly wary of having their content scraped without permission or compensation to train and run these new AI models.

Cloudflare, by choosing to block the undeclared crawling, has effectively appointed itself the AI data police, making decisions about what constitutes “legitimate” traffic on the internet. Perplexity warns that this could lead to a “two-tiered internet” in which access depends not on the user’s needs but on whether the chosen AI tool has “been blessed by infrastructure controllers.”

The internet’s rules are being rewritten in real time. The old gentleman’s agreement is crumbling, and the battle between the gatekeepers and the innovators has begun. The outcome will determine not only the future of AI, but the future of the open web itself.

