Reddit restricts what the Wayback Machine can archive


By [email protected]


The Internet Archive's Wayback Machine is the latest casualty of Reddit's crackdown on access to its data. The company has begun placing new restrictions on what the archive site can access, a move that will significantly limit the Wayback Machine's ability to preserve information from Reddit.

With the change, the Wayback Machine, a project run by the nonprofit Internet Archive, will only be able to crawl Reddit's homepage. It will not be able to access comments, subreddit pages, post details, user profiles or other data.

The move is the latest step Reddit has taken to limit the ability of artificial intelligence companies to use its data to train large language models without paying licensing fees. It is also a significant departure from the position the company took last year, when it explicitly said it would not restrict “good faith actors,” including the Internet Archive. It is not clear exactly what has changed since then. Reddit appears to believe that artificial intelligence companies are circumventing its rules by scraping data via the Wayback Machine. We have contacted the Internet Archive for comment.

Data licensing has become an important business for Reddit. The company has struck multimillion-dollar deals with OpenAI and Google that allow them to use Reddit posts to help train artificial intelligence models. Meanwhile, Reddit has increasingly pushed back against companies trying to use its data without such arrangements. Earlier this year, the company filed a lawsuit against Anthropic, claiming that it had scraped Reddit for years without permission.


