In its latest legal battle, OpenAI has drawn a curious analogy between the design of the Phillip Burton Federal Building and the legal principle of stare decisis.
The lawsuit, brought by authors including Michael Chabon, Ta-Nehisi Coates, and Sarah Silverman, alleges that OpenAI unlawfully used their copyrighted works to train its AI models. OpenAI’s lawyers introduced a quote generated by GPT-4, in which the model compares the building at 450 Golden Gate Avenue to the legal doctrine of stare decisis. According to the filing, GPT-4 responded to a prompt with: “Just as stare decisis provides a stable framework where past decisions guide future rulings, the building’s design reflects a sense of order, consistency, and structure.” This piece of AI-generated prose was presented as evidence that the model can produce unique, original content, rather than merely replicating the material it was trained on.
The class-action lawsuit OpenAI is responding to is one of a growing number of legal claims that accuse the company of infringing copyright by using vast amounts of data, potentially including millions of books, to train its AI models. The filing, submitted in the Northern District of California, seeks to counter these claims by arguing that ChatGPT creates new material rather than copying existing works. As OpenAI’s lawyers put it, “It is the model’s unique synthesis of the language and facts that it has learned.”
OpenAI’s response appears to foreshadow a key element of its defense: the principle of fair use. In the filing, the company asserts that it has the right to learn from existing material to generate new content.
The lawyers argue, “The models learn, as we all do, from what has come before. The fair use defense exists for precisely that reason.” Essentially, OpenAI contends that training its models on copyrighted materials is permissible under fair use because the purpose of the models is to create entirely new outputs that never existed before.
This legal strategy is significant, especially considering the increasing number of copyright lawsuits filed against OpenAI. These include cases brought by the Authors Guild and prominent authors such as John Grisham and David Baldacci, as well as a separate lawsuit by The New York Times. Because the sheer scale of the data used by OpenAI makes it challenging to definitively prove that no copyrighted material was involved, the company is likely to rely heavily on fair use as its primary defense.
However, OpenAI isn’t putting all its eggs in the fair-use basket. The filing outlines 11 additional defenses, including claims that some of the materials used are in the public domain, that OpenAI didn’t copy a substantial amount of copyrighted content, and that the company wasn’t aware of any infringement when it occurred.