Euro Times

Reddit stands firm against AI companies scraping content for training without paying

by Cal Jeffrey
August 2, 2024
in Technology
Reading Time: 4 mins read


A hot potato: Reddit has been making moves as part of a crackdown on companies indiscriminately scraping the website for AI training purposes. Its philosophy is that AI companies stand to make millions or billions on large language models they are developing with resources they don’t own. It is analogous to someone taking two-by-fours from a lumberyard to build their house just because the yard doesn’t have a locked gate. But the issue goes way beyond Reddit and is central to how the open web has worked so far.

The Robots Exclusion Protocol is a web standard used to control and manage web crawler and bot access to websites. Defined by the robots.txt file, it tells search engines which parts of a site can be crawled or indexed, helping webmasters protect sensitive content and manage traffic efficiently. However, it works on the honor system, with few ways to enforce it.
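
As a rough illustration of that honor system, the sketch below shows how a well-behaved crawler might consult robots.txt before fetching a page, using Python’s standard urllib.robotparser module. The rules, the bot name ExampleBot, and the example.com URLs are all made up for illustration; they are not Reddit’s actual file or any real crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption, not Reddit's real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" and the URLs are placeholders chosen for this sketch.
public_ok = is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/posts/1")
private_ok = is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/private/data")

print(public_ok, private_ok)
```

Nothing in this check stops a crawler from skipping it and requesting the page anyway, which is exactly why the protocol depends on good faith.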

Last week, Ars Technica reported that Reddit posts weren’t appearing in any search engines other than Google. It’s no big mystery that Reddit already penned a $60 million licensing deal with Alphabet to use its content for training. Meanwhile, Reddit has been increasingly ranking at the top of Google searches this past year (quid pro quo, or maybe not…).

The company also recently notified users that it changed its robots.txt file to exclude bots and crawlers that didn’t have permission to access its data. Reddit CEO Steve Huffman said he believes in an open internet but that companies now use search engine web crawlers to scrape information for profit, a far cry from their historical use. “I think the traditional value exchange from search engines has changed,” Huffman told The Verge.

“Search and summarization and training are merging, and the value exchange of crawling in exchange for traffic back is becoming muddied.”

So far, Huffman said, blocking companies unwilling to pay for data harvesting has been “a real pain in the ass,” prompting the changes to Reddit’s robots.txt. For the most part, companies have respected Reddit’s wishes, and several, including Microsoft, Anthropic, and Perplexity, have entered negotiations to license its content.

Huffman said that the biggest thorn in his side is that some companies scraping Reddit data are turning around and selling it to other AI firms via their APIs. He specifically called out Microsoft AI CEO Mustafa Suleyman for recently comparing all public data on the internet to “freeware.”

“We’ve had Microsoft, Anthropic, and Perplexity act as though all of the content on the internet is free for them to use,” said Huffman. “That’s their actual position.” While Microsoft Bing has been gracious in respecting Reddit’s decision to block its crawlers, the company managed to slip in a denigrating remark.

Microsoft AI CEO Mustafa Suleyman: the social contract for content that is on the open web is that it’s “freeware” for training AI models pic.twitter.com/FN1xrqnJC0

– Tsarathustra (@tsarnick) June 26, 2024

“Reddit has blocked Bing from crawling their site for search, favoring another search engine and impacting competition from Bing and Bing-powered engines,” Microsoft spokesperson Caitlin Roulston said last week. “We honor the directions provided by websites that do not want content on their pages to be used with our generative AI models.”

So far, Google and OpenAI are the only search engines on Reddit’s whitelist. If other engines return anything but old Reddit content, then they are not abiding by the site’s robots.txt document.
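
A whitelist of this kind is typically expressed in robots.txt as explicit Allow groups for approved crawlers followed by a catch-all Disallow. The sketch below builds such a file and checks it with Python’s standard urllib.robotparser. The agent names (PartnerSearchBot, PartnerModelBot, UnlicensedScraper) and the example.com URL are hypothetical, and the generated rules are only an assumption about how such a policy could look, not a copy of Reddit’s file.

```python
from urllib.robotparser import RobotFileParser


def build_allowlist_robots(allowed_agents: list[str]) -> str:
    """Build robots.txt text that admits only the named agents."""
    # One group per approved crawler, each allowed to fetch everything.
    groups = [f"User-agent: {agent}\nAllow: /\n" for agent in allowed_agents]
    # Catch-all group: every other crawler is asked to stay out entirely.
    groups.append("User-agent: *\nDisallow: /\n")
    # Joining with a newline leaves a blank line between groups.
    return "\n".join(groups)


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical agent names, chosen only for this sketch.
robots_txt = build_allowlist_robots(["PartnerSearchBot", "PartnerModelBot"])
url = "https://example.com/r/some_forum/"

partner_ok = may_crawl(robots_txt, "PartnerSearchBot", url)
scraper_ok = may_crawl(robots_txt, "UnlicensedScraper", url)

print(robots_txt)
print(partner_ok, scraper_ok)
```

A crawler outside the list that still surfaces fresh content has, by definition, ignored the catch-all rule rather than been blocked by it.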

Reddit profiting from user-generated content through these licensing deals is still a hot potato. On the one hand, the lucrative fees don’t go into the pockets of the community members who make up Reddit’s forums. On the other hand, these licensing deals are not much different from those of other companies.

OpenAI already pays licensing fees to large publishers like Dotdash Meredith, Axel Springer, the Associated Press, and The Atlantic. It is unconfirmed but doubtful that these publications pass these profits to their writers via raises or bonuses. Does that make it right? No, and the courts are still trying to decide how to treat this unprecedented activity. However, it’s par for the course at this point.

And this very issue is not limited to Reddit but affects all online publishers, big and small. In the race against AI training abuse, Reddit is one of the few with the muscle and influence to call out AI companies. While big media companies try to monetize and reach agreements, the rest of the internet is struggling. In fact, some subreddits have their own bots that copy and paste entire written content from original sources and display it as the first comment in the thread, effectively copying the content and then selling that to AI companies.

Until there are governing regulations, the AI gold rush will be just like the California gold rush of 1848. Artificial intelligence firms will continue flocking to shovel AI products down everyone’s throats for profit or to gather more data. Meanwhile, companies like Reddit and Vox will keep handing them the shovels.

Image credit: Jernej Furman





Copyright © 2022 - Euro Times.
Euro Times is not responsible for the content of external sites.
