A US federal judge has ruled that Anthropic’s use of legally purchased print books to train AI models is “fair use” under US copyright law. Judge William Alsup found that digitizing purchased books and then using them to train language models does not infringe authors’ rights. The court emphasized that this finding applies only to physical copies that Anthropic purchased, disassembled, and scanned to create its own digital library.
The lawsuit was filed by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, who claimed that Anthropic used their copyrighted works to train its models without permission. The court found, however, that training AI on legally purchased books does not amount to direct copying of the works or to imitating the style of specific authors. According to the judge, this use is sufficiently distinct from the original purpose of the works and is consistent with copyright law’s goal of fostering creativity and scientific progress.
Even so, Judge Alsup ordered a separate trial over the millions of pirated books that Anthropic allegedly downloaded from the internet and stored in its library. That question, which bears on potential damages, is not covered by the current decision. The judge noted that later purchasing a legal copy of a book does not exempt the company from liability for having previously downloaded a pirated one.