Salesforce Sued for ‘Stolen Books’ in AI Copyright Lawsuit
In what could become a defining moment for AI copyright law, bestselling authors Molly Tanzer and Jennifer Gilmore have filed a class action lawsuit against Salesforce Inc., accusing the tech giant of secretly using thousands of copyrighted books to train its xGen AI models without consent or payment.
Filed in the Northern District of California in October 2025, the case Tanzer et al. v. Salesforce asks one provocative question now echoing across creative and legal circles:
When AI learns from your words, does that count as inspiration, or theft?
The complaint goes far beyond one company’s conduct. It challenges the very foundations of how modern generative AI systems are built, monetized, and justified under the legal shield of “fair use.”
And for Salesforce, a brand long associated with ethical innovation, the optics could be devastating.
How the Lawsuit Unfolded
The 46-page complaint alleges that Salesforce trained its xGen AI models on the so-called Books3 corpus, a massive dataset containing hundreds of thousands of novels, essays, and literary works scraped from the internet, many of them under active copyright.
According to the plaintiffs, these texts were downloaded, stored, and copied in full, forming the linguistic backbone of xGen’s capabilities.
Such acts, they argue, violate the exclusive reproduction rights granted to authors under Section 106 of the U.S. Copyright Act, while giving Salesforce an enormous commercial advantage over creators who received nothing.
Adding to the controversy, the suit highlights public statements by Salesforce’s CEO Marc Benioff, who previously condemned other AI firms for using “stolen data.”
That rhetorical reversal adds a powerful emotional undercurrent and makes this case as much about corporate credibility as copyright law.
The Legal Heart: Fair Use vs. Copyright Protection
To many observers, Tanzer v. Salesforce feels like a sequel to Authors Guild v. Google, the 2015 landmark that allowed Google to digitize books for its search index under the doctrine of transformative fair use.
But the similarities stop there.
What is a copyright lawsuit involving AI?
A copyright lawsuit involving artificial intelligence occurs when creators allege that an AI system used their protected works (such as books, music, or images) without permission during model training.
These cases test whether machine learning qualifies as fair use under U.S. law or constitutes unauthorized copying of original content.
Where Google displayed only brief, non-substitutive snippets, Salesforce’s AI training allegedly ingested entire books, creating machine-learning weights that could be used to generate new text in similar style or tone.
The authors claim this process erases the line between study and reproduction, turning human creativity into raw machine fuel.
Salesforce, for its part, is expected to argue that:
- Model training is transformative, producing data representations, not creative copies.
- The process doesn’t compete with the original market, satisfying the fourth fair-use factor.
- Limiting AI training would stifle innovation across industries relying on machine learning.
Understanding what courts mean by “transformative” is key here. As explored in Transformative Fair Use Explained: How to Legally Reuse Works in U.S. Copyright Law, the doctrine allows some reuse, but only when new meaning, message, or purpose is added.
The question now is whether teaching a machine to imitate writing styles qualifies.
Recent rulings such as Court Rules AI Cannot Be Copyrighted: Landmark Ruling on Human Authorship also underscore that copyright demands human input. The Salesforce case now tests the reverse: whether AI can legally consume human works without infringing them.
Regulation, Ethics, and the Coming AI Accountability Era
This lawsuit lands amid a broader regulatory awakening. Legislators in Washington are drafting bills that would:
- Require transparency in AI training datasets,
- Create licensing frameworks for copyrighted material, and
- Establish royalty systems compensating creators for data use.
The U.S. Copyright Office is simultaneously reviewing whether AI training qualifies as “reproduction,” potentially setting a new legal threshold for compliance.
If courts act before lawmakers do, Tanzer v. Salesforce could set de facto national policy dictating how AI companies license data in the years ahead.
Salesforce’s case also carries a strong ethical dimension. Benioff’s vocal support for ethical capitalism and responsible tech use may amplify scrutiny.
In an era when investors and consumers value authenticity, perceived hypocrisy in AI ethics could become a reputational liability far greater than the lawsuit’s financial risk.
A Landmark Test for the Future of AI and Copyright
The plaintiffs seek class certification covering thousands of authors whose works were allegedly used in Salesforce’s datasets.
If granted, the financial exposure could reach hundreds of millions of dollars.
Discovery will likely reveal how Salesforce sourced its training data and whether internal discussions acknowledged copyright risks.
Beyond Salesforce, this lawsuit tests whether AI model training equals copying under U.S. law. A plaintiff victory could force developers to license creative content, spawning a new ecosystem for AI data rights management.
Conversely, a Salesforce win might cement fair use as a shield for large-scale training, leaving creators sidelined from the digital economy built on their words.
This debate isn’t confined to literature. Similar disputes are unfolding across industries, including film and design, as seen in Disney & Universal vs. Midjourney: Inside the AI Copyright Battle That Could Rewrite Hollywood Law.
Together, these cases mark a global turning point for how law defines creativity in the age of algorithms.
Final Thought
The Tanzer v. Salesforce case goes beyond legal arguments; it’s part of a larger conversation about what creativity means in the age of machines.
If the authors win, it could mark the start of a new era where writers, artists, and creators are finally recognized and compensated for the value their work brings to artificial intelligence.
If Salesforce prevails, it may set a precedent that blurs the line between inspiration and imitation, raising uncomfortable questions about who truly owns creative expression in a digital world.
Whatever the outcome, the decision will ripple far beyond Silicon Valley, shaping how society balances innovation, ownership, and the human voice within AI’s expanding reach.
People Also Ask (PAA)
What is the Salesforce AI copyright lawsuit about?
The case involves authors accusing Salesforce of using their copyrighted books without permission to train its xGen AI model. They claim this violates the U.S. Copyright Act and undermines creative ownership in the age of artificial intelligence.
Why are authors suing Salesforce?
Writers Molly Tanzer and Jennifer Gilmore filed a class action alleging that Salesforce’s AI learned from pirated or unlicensed works. Their lawsuit seeks damages and stronger legal protection for creative content used in AI training.
Is it legal to use copyrighted books to train AI models?
The legality depends on fair use — a doctrine that allows limited use of copyrighted material for transformative purposes. Courts must now decide whether teaching AI to generate new text counts as transformation or infringement.
What could happen if Salesforce loses the lawsuit?
If the authors prevail, Salesforce may face major financial penalties and be forced to license copyrighted data. The decision could also set a national precedent requiring all AI developers to pay for the creative works they use.
How could this case impact future AI laws?
A ruling against Salesforce could shape how lawmakers regulate data transparency and copyright licensing in AI development. It may redefine fair use, forcing companies to rethink how they train large language models.