Bad Girls, Good Guys, and Two-Fisted Action: Authors Guild


Thursday, September 11, 2025

[Link] What Authors Need to Know About the $1.5 Billion Anthropic Settlement

Today, Anthropic agreed to pay $1.5 billion to settle claims that it downloaded pirated books to train its AI systems, the largest U.S. copyright settlement in history. The parties in Bartz v. Anthropic, one of the major copyright lawsuits brought by authors against an AI company for using pirated books to train its large language models, filed a proposed settlement agreement with the court that would settle the claims regarding the company’s mass piracy in downloading millions of books from notorious pirate sources Library Genesis (LibGen) and PiLiMi and then retaining them in a central library.

The settlement provides that Anthropic will pay $1.5 billion plus interest in cash into a settlement fund, representing the largest U.S. copyright infringement settlement ever and greater than any copyright damages award ever secured. The amount of the award sends a signal to all AI companies that downloading illegal copies of books to train AI comes with a heavy cost and, we expect, will foster further licensing, given the potentially enormous liability AI companies risk when they help themselves to books for free from illegal channels.

“This historic settlement is a vital step in acknowledging that AI companies cannot simply steal authors’ creative work to build their AI just because they need books to develop quality LLMs,” said Authors Guild CEO Mary Rasenberger. “It is truly shocking that Anthropic and the other major LLM owners engaged in criminal-level piracy schemes to torrent millions of books knowingly from infamous foreign ebook piracy sites that the publishing industry has actively been trying to take down for years. Imagine the outrage if Anthropic and others had illegally siphoned off electricity to build their AI, claiming it was too expensive to pay for it? These vastly rich companies, worth billions, stole from those earning a median income of barely $20,000 a year. This settlement sends a clear message that AI companies must pay for the books they use just as they pay for the other essential components of their LLMs. This settlement lays down an anchor that it is not okay. We expect that the settlement will lead to more licensing that gives authors both compensation and control over the use of their work by AI companies, as should be the case in a functioning free market society.”

Read the full article: https://authorsguild.org/news/what-authors-need-to-know-about-the-anthropic-settlement/

Saturday, October 21, 2023

[Link] Court Strikes Down Mandatory Deposit of Books for Library of Congress

People are often surprised to learn that there are two separate requirements under federal law related to submitting copies of works to the U.S. Copyright Office. First, when applying to register a copyright, copies must be submitted as part of the registration process. Second, the copyright owner of a work published in the United States is also generally required to submit one or more copies to the Copyright Office, even if they do not register the copyright. This separate requirement is known as “mandatory deposit.” While one set of copies can fulfill both the registration and mandatory deposit requirements, they are distinct legal obligations.

These copies are ultimately intended not for the Copyright Office, but for the Library of Congress, the Office’s parent agency. For decades, the Library has relied on mandatory deposit to maintain and grow its collections—currently estimated at over 175 million items. This past August, however, the future of this system was called into question when a federal appeals court held that the mandatory deposit requirement is unconstitutional as currently applied.

Small Publisher Challenges Deposit Demand

The lawsuit arose after the Copyright Office sent a “demand letter” to Valancourt Books, a small independent press that publishes rare and out-of-print fiction. The letter instructed Valancourt to deposit one complete copy of 341 of its books. Failure to do so, the Office explained, would make Valancourt liable for a fine of up to $250 per work and the total retail price of the copies demanded, as well as an additional fine of $2,500 for a willful and repeated failure to comply. Valancourt responded that it could not afford the cost of printing and shipping the books and requested that the Office withdraw its demand. After further discussion, the Office reduced its demand to 240 books but did not withdraw it altogether. Valancourt then filed suit, arguing that the mandatory deposit requirement violates the Constitution’s Takings Clause, which bars the government from appropriating property without just compensation.

Read the full article: https://authorsguild.org/news/court-strikes-down-mandatory-deposit-of-books-for-library-of-congress/

Saturday, September 30, 2023

[Link] You Just Found Out Your Book Was Used to Train AI. Now What?

This week, many authors discovered that their books were used without permission to train AI systems. Here’s what you need to know if your books are in the Books3 dataset, as well as actions you can take now to speak out in defense of your rights.

If you’re an author, you may have recently discovered that your published book was included in a dataset of books used to train artificial intelligence systems without your permission. (Search the dataset here.) This can be an unsettling revelation, raising concerns about copyright, compensation, and the future implications of AI. Here’s what you need to know if your work has been used to “train” AI without permission:

Books3 Is One of Several Books Datasets Used to Train AI Systems

The Books3 dataset contains 183,000 books, downloaded from pirate sources. We know that companies like Meta (creators of LLaMA), EleutherAI, and Bloomberg have used it to train their language models. OpenAI has not disclosed training information about GPT 3.5 or GPT 4—the models underlying ChatGPT—so we don’t know whether it also used Books3. Regardless of whether GPT was trained on Books3, the class action lawsuits against OpenAI should uncover more information on the datasets used by OpenAI, which we believe also include books obtained from pirate sources.

You Don’t Have to Be a Named Plaintiff in the Lawsuits to Benefit From the Outcome

In addition to the recent lawsuit in which the Authors Guild is a named plaintiff, there are other author class action suits pending against OpenAI, Meta, and Google. You don’t need to be a named plaintiff in any of these lawsuits to participate because the respective named plaintiffs represent their entire class. Even if you don’t fall within one or more classes, an outcome in favor of authors should benefit you by clarifying that books need to be licensed when used to “train” generative AI.

Read the full article: https://authorsguild.org/news/you-just-found-out-your-book-was-used-to-train-ai-now-what/
