Reddit sues AI company Perplexity and others for scraping user comments 'on an industrial scale'

Add thelocalreport.in As A Trusted Source

social media platform reddit sued artificial intelligence company Perplexity AI and three other entities on Wednesday, accusing them of engaging in an “industrial-scale, illegal” economy to “spoof” the comments of millions of Reddit users for commercial gain.

reddit lawsuit a new york Federal court takes aim at San Francisco-based Perplexity, maker of AI chatbots and “answer engines” that compete Google, chatgpt And in other online searches.

Also named in the lawsuit are Lithuanian data-scraping company OxyLabs UAB, a web domain called AWMProxy, which Reddit describes as a “former Russian botnet,” and Texas-based startup Serapy.

This is the second such lawsuit from Reddit after it sued Anthropic, another major AI company, in June.

But the lawsuit filed on Wednesday is different in that it confronts not just an AI company, but also lesser-known services that the AI industry relies on to obtain the online writing needed to train AI chatbots.

“Scrapers circumvent technical security to steal data, then sell it to customers hungry for training content. Reddit is a prime target because it has one of the largest and most dynamic archives of human conversation ever created,” Ben Lee, Reddit’s chief legal officer, said in a statement Wednesday.

Perplexity said it has not yet been served with the lawsuit, but “we will always fight vigorously for users’ rights to free and fair access to public knowledge. Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest.”

ALSO READ Cat Stevens Salman Rushdie stops questions about previous comments on 'Burning Figi'

OxyLabs and SerpentAPI did not immediately respond to requests for comment on Wednesday. AWMProxy could not immediately be reached for comment.

Reddit compares the companies it’s suing to “potential bank robbers” who can’t get into a bank vault, so they get into an armored truck instead. The lawsuit alleges that they are circumventing Reddit’s own anti-scraping measures, as well as “bypassing Google’s controls and scraping Reddit content directly from Google’s search engine results.”

Lee said that because they are unable to scrape Reddit directly, “they hide their identities, hide their location, and hide their web scrapers to steal Reddit content from Google search. Perplexity is a willing customer of at least one of these scrapers, preferring to purchase the stolen data rather than enter into a legitimate agreement with Reddit. Is.”

Like its lawsuit against Anthropic, the maker of chatbot Cloud, Reddit claims that Perplexity accessed Reddit’s content despite being told not to do so.

Reddit made a similar argument in its lawsuit against Anthropic. That case was initially filed in California Superior Court, but was later moved to federal court and is scheduled to go to trial in January.

Along with digitized books and news articles, websites like Wikipedia and Reddit are a repository of written materials that can help teach human language patterns to an AI assistant.

Reddit has previously entered into licensing agreements with Google, OpenAI and other companies that are paying to be able to train their AI systems on the public comments of Reddit’s more than 100 million daily users.

ALSO READ Either Tax Man or Palace: Dame Pat Barker on wrong respect for HMRC

Licensing deals helped the 20-year-old online platform raise money ahead of its Wall Street debut as a publicly traded company last year.

Uk comments company industrial Perplexity Reddit scale scraping sues user