Why is Reddit shutting out the Wayback Machine now?

Reddit has blocked the Wayback Machine from archiving most of its site to prevent unlicensed AI scraping, prioritizing data licensing over web preservation.

By Jace Reed

Reddit is taking a firm stand against what it calls unauthorized AI data scraping by limiting the Internet Archive’s Wayback Machine to archiving only its homepage. As of August 11, 2025, most post and comment pages are blocked from archival.

The company says the move is directly tied to protecting its growing data licensing business, which now generates about 10% of its total revenue through high-profile deals with Google and OpenAI.

AI-driven data protection

A Reddit spokesperson said the block will remain until the Internet Archive can enforce stricter anti-scraping measures, including removing deleted content and honoring platform policies. Executives have accused AI developers of bypassing licensing deals by scraping from archived pages.

This stance follows API pricing changes, announced in April 2023 and effective that July, that made third-party access prohibitively expensive, signaling a long-term strategy to control who uses Reddit’s user-generated content.

Did you know?
The Internet Archive’s Wayback Machine has indexed over 866 billion web pages since its launch in 2001, making it the largest library of archived websites in history.

Licensing is big business

Reddit’s licensing arrangements are lucrative: a reported $60 million-per-year agreement with Google and a deal with OpenAI reported at about $70 million per year, together worth around $130 million annually. These AI training partnerships have turned data access into a significant new revenue channel.

By restricting archival, Reddit can limit free external access and funnel more negotiations toward paid deals, strengthening its control in an AI-hungry market.

Reddit has already filed a lawsuit against AI startup Anthropic, accusing it of accessing the site more than 100,000 times without permission to scrape content, including deleted material. The company is leaning on breach-of-contract claims rather than copyright claims to press its case.

These proactive measures reflect a growing trend among social platforms to set firm ground rules for AI companies looking to harvest their content.

Impact on web preservation

For researchers, historians, and journalists, the block is a blow to the long-term preservation of online discourse. Future Wayback Machine captures will show trending headlines but not the community discussions that defined Reddit’s cultural footprint.

The Internet Archive has yet to comment, but the standoff underscores a broader tension between monetizing data and maintaining an open record of internet history. Whether Reddit restores access depends on whether the Internet Archive meets Reddit’s privacy and anti-scraping requirements.

© 2025 MoneyOval.
All rights reserved.