Reddit is accusing the artificial intelligence company Anthropic of scraping user content from its platform without permission and using it to train its Claude AI models. The lawsuit, filed in a California court, claims that Anthropic made more than 100,000 unauthorized requests to Reddit’s servers, even after publicly saying that it had stopped.
The case is built around Reddit’s claim that Anthropic ignored both technical restrictions and its terms of use. According to the complaint, Anthropic bypassed protections like the site’s robots.txt file, which is meant to prevent automated scraping. Reddit also accuses Anthropic of violating user privacy by collecting and using personal posts (including deleted content) for commercial purposes.
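For readers unfamiliar with robots.txt: it is a plain-text file in which a site tells crawlers which paths they may request, and compliance is voluntary on the crawler's side. The sketch below shows how a well-behaved crawler might consult such a file before fetching a page, using Python's standard `urllib.robotparser`. The file contents, the bot names, and the URLs are hypothetical and are not taken from Reddit's actual robots.txt or from the complaint.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for illustration only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules allow this user agent to request the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(EXAMPLE_ROBOTS_TXT)

# ExampleBot is disallowed everywhere; other bots are only kept out of /private/.
blocked = may_fetch(parser, "ExampleBot", "https://example.com/r/some_thread")
allowed = may_fetch(parser, "OtherBot", "https://example.com/r/some_thread")
private = may_fetch(parser, "OtherBot", "https://example.com/private/page")
```

Nothing in this check technically stops a crawler that skips it, which is why disputes like this one turn on terms of use and conduct rather than on the file alone.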
Reddit said it provides structured access to its data through licensing agreements with companies such as OpenAI and Google. These deals include terms covering content use, privacy safeguards, and data deletion. According to the platform, Anthropic declined to pursue a formal agreement and instead scraped the site directly, avoiding licensing fees and skipping user protections in the process.
The lawsuit highlights a 2021 research paper co-authored by Anthropic CEO Dario Amodei that pointed to Reddit as a rich source of training data for language models. Reddit also includes examples in which Claude appears to reproduce Reddit posts almost word for word, including posts that users had removed. The company says that Anthropic has failed to set up guardrails to respect user privacy and content takedowns.
Reddit is seeking financial damages and a court order barring Anthropic from using Reddit content in future versions of its models.
Anthropic responded by saying it disagrees with the claims and plans to defend itself. However, this is not the first time the company has faced legal pressure over how it collects training data.
In August 2024, a group of authors filed a class action lawsuit accusing Anthropic of using copyrighted work without permission. They alleged that the company trained its models on books and other written materials without consent, and they are seeking compensation for the use of that content.
A similar case involving Universal Music Group and other publishers has been under way since October 2023. They sued Anthropic over allegations that the Claude chatbot had reproduced the lyrics of copyrighted songs. The music companies claimed that this use violated their intellectual property rights and asked the court to block further use of the lyrics.
Unlike those cases, Reddit’s case does not focus on copyright. Instead, it centers on breach of contract and unfair competition. Reddit’s argument is that data obtained from its site is not simply public, but is governed by terms that Anthropic deliberately ignored. That distinction could matter for other platforms that host user content and want to control how it is used in commercial AI systems.
Reddit also accuses Anthropic of misleading the public. The lawsuit points to official statements from Anthropic claiming that it respects scraping restrictions and user privacy, statements Reddit says are inconsistent with the company’s actions.
“Despite what its marketing material says, Anthropic does not care about Reddit’s rules or users,” the lawsuit reads. “It believes it is entitled to take whatever content it wants and use that content however it desires, with impunity.”
Reddit’s shares rose nearly 67% after the lawsuit was filed, a sign that investors support the move. The outcome of the case could set a precedent for how companies balance open internet content with the rights of users and content owners.
With more AI companies relying on large amounts of online data, the legal and ethical questions around scraping are becoming harder to ignore. The Reddit case joins a growing list of lawsuits that will shape how the next wave of AI development unfolds.
(Photo: Brett Jordan)