Training AI On Mastodon Posts? The Idea's Extinct After Terms Updated

Mastodon is the latest platform to push back against AI training, updating its terms and conditions to ban the use of user content for large language models (LLMs).

"We want to make it clear," the federated platform stated in an email to users, "that training LLMs on the data of Mastodon users on our instances is not permitted."

The announcement may feel like shutting the stable door after the horse has bolted, but it's still reassuring to know that users' rants on the platform, in theory, won't feed into the LLMs behind generative AI services.

To be fair, enforcing such restrictions on a platform that prides itself on decentralization and openness could prove difficult. The terms apply only to Mastodon's own instances, not the wider Fediverse. It's possible to deploy a robots.txt file to block AI crawlers, but that relies on those behind the bots respecting it rather than invoking fair use.
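To make the robots.txt point concrete, here is a minimal sketch of how a well-behaved crawler might consult the file before fetching a post. It uses Python's standard urllib.robotparser. The rules, the host example.social, and the helper name may_fetch are illustrative assumptions, not Mastodon's actual configuration; GPTBot and CCBot are used only as examples of crawler user-agent names.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt an instance operator might serve.
# These rules are an illustration, not Mastodon's real file.
EXAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    post = "https://example.social/@alice/1"  # hypothetical post URL
    for agent in ("GPTBot", "CCBot", "SomeBrowser"):
        print(agent, may_fetch(EXAMPLE_ROBOTS_TXT, agent, post))
```

The check runs entirely on the crawler's side. Nothing in the protocol stops a bot from skipping it, which is why robots.txt works as a request rather than a barrier.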

Mastodon is not the only platform worried about its content being used for AI training. Another social media platform, Bluesky, recently said: "We do not use any of your content to train generative AI, and have no intention of doing so," but, as the service acknowledged, enforcement of such a rule outside its systems is challenging.

As 2024 drew to a close, a million public posts from Bluesky's firehose API turned up in a training set.

Earlier in June, discussion forum Reddit sued Anthropic, an AI business, over allegations that content generated by its users was scraped in violation of contractual terms and technical barriers. The suit did not cite examples of any alleged robots.txt violations by Anthropic after July 2024.

In 2024, Reddit signed a data-sharing deal with OpenAI. Earlier that year, it signed an AI training deal with Google, having begun charging companies to use its data-downloading API in 2023.

Mastodon's change highlights the concerns of users over how their data might be used, particularly on platforms that are, by their nature, as free and open as possible.

The updates, including an increase in minimum age from 13 to 16, take effect from July 1. ®
