Byte the Bullet: NYT Takes on Tech Titans in a Copyright Showdown Over AI

Photo by Conny Schneider on Unsplash.

In a seismic collision of technology and journalism, December 2023 saw The New York Times (NYT) thrust the issue of artificial intelligence (AI) into the legal and ethical crossfire. The NYT filed a lawsuit against tech giants and AI-industry leaders Microsoft and OpenAI, targeting OpenAI's ChatGPT language model for alleged copyright infringement. The charge is weighty: ChatGPT purportedly harnessed millions of NYT articles, even those shielded behind paywalls, to train its language capabilities.

The sought-after damages run into the billions, but beyond the financial stakes, it is the groundbreaking copyright law implications that promise to redefine the relationship between media giants and AI. 

This article explores the critical dimensions of the lawsuit, placing it within the broader context of AI language models. By doing so, it aims to unravel the potential reverberations of the suit for grassroots journalism and independent media in a landscape increasingly shaped by algorithms and machine learning.

AI Behind the Scenes: Understanding ChatGPT's Training

AI systems like ChatGPT learn by being exposed to vast amounts of data, such as internet text. Through this exposure they discern patterns and grasp the intricacies of language, allowing them to understand context, expressions, and diverse nuances. Once trained, the AI uses the knowledge it has gained to generate responses, answer queries, or perform specific tasks, applying the patterns learned from the training data to comprehend and generate human-like text.
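To make this pattern-learning idea concrete, here is a deliberately toy Python sketch. It is not OpenAI's training code; real models learn billions of parameters over enormous corpora, while this example merely counts which word tends to follow which in a tiny stand-in corpus and then generates text from those counts. The corpus and function names are illustrative inventions.

```python
import random
from collections import defaultdict

# A tiny stand-in for the vast body of text a real model trains on.
corpus = "the court ruled on the case and the court set a precedent"
words = corpus.split()

# "Training": count which word follows which (bigram statistics).
follows = defaultdict(list)
for prev, nxt in zip(words, words[1:]):
    follows[prev].append(nxt)

def generate(start, length=6, seed=0):
    """Generate text by repeatedly sampling a word that followed
    the previous word somewhere in the training corpus."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:  # dead end: no word ever followed this one
            break
        out.append(rng.choice(options))
    return " ".join(out)

print(generate("the"))
```

The output mimics the statistical shape of the corpus without storing any sentence verbatim; large language models do something analogous at vastly greater scale, which is precisely why the provenance of their training text matters.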


To accomplish this learning, AI systematically “scrapes” (extracts) data from online sources. In the case of ChatGPT, this involved collecting information from a wide range of websites to create an extensive training dataset. This scraping, essentially a colossal form of data mining, is often performed without explicit permission from content creators. This legal quandary has prompted some news organizations to negotiate licensing agreements with OpenAI, granting permission to scrape specific material.
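The extraction step can be sketched with Python's standard library alone. This is a simplified illustration, not the pipeline OpenAI actually uses: a real scraper would first download pages over HTTP and handle far messier markup, whereas here a hard-coded snippet of HTML stands in for a fetched article.

```python
from html.parser import HTMLParser

class ParagraphScraper(HTMLParser):
    """Collects the text inside <p> tags -- the kind of extraction
    a training-data pipeline might run over fetched article pages."""
    def __init__(self):
        super().__init__()
        self.in_p = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        if self.in_p:
            self.paragraphs[-1] += data

# A stand-in page; a real scraper would download this HTML over HTTP.
html = "<html><body><h1>Headline</h1><p>First paragraph.</p><p>Second one.</p></body></html>"
scraper = ParagraphScraper()
scraper.feed(html)
print(scraper.paragraphs)  # -> ['First paragraph.', 'Second one.']
```

Multiply this by millions of pages and the legal question at the heart of the lawsuit comes into focus: the code never asks whether the paragraphs it harvests sit behind a paywall or carry a copyright notice.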

Rather than negotiating such an agreement, however, the NYT, historically recognized for its willingness to pursue legal action, filed a lawsuit against OpenAI, alleging that OpenAI scraped data from the NYT without explicit authorization.

This move not only raises questions about the legality of scraping but also delves into the broader ethical dimensions of training AI systems. It marks a significant juncture in the evolution of artificial intelligence, aligning with the NYT's track record of contributing to major legal precedents (see NYT Co. v. United States, NYT v. Sullivan, or NYT Co. v. Tasini).

Legal Landscape: Unpacking the NYT Lawsuit

In late December 2023, the NYT filed the lawsuit in the U.S. District Court for the Southern District of New York, though many observers expect the legal showdown to escalate to the Supreme Court. Under federal copyright law, each willful violation of copyright allows for statutory damages of up to $150,000. Given the potential scale of the alleged violations, the damages in question could reach staggering amounts.

Central to the dispute is the interpretation of the fair use doctrine, which allows the use of copyrighted material under certain circumstances, such as criticism, commentary, news reporting, teaching, scholarship, or research. Did AI models, including ChatGPT, illegally copy content from sources like the NYT to train their models? Or were these new systems engaging in “fair use” of the online content?

If this case reaches the Supreme Court, it is likely to be weighed against two major legal precedents. The first is a 2015 federal appeals court ruling that Google's digital scanning of millions of books for its Google Books library constituted fair use; the court held that Google's use did not compete with the original works and was not a market substitute for the books themselves. On the other hand, a May 2023 Supreme Court decision involving the late artist Andy Warhol's portraits of the musician Prince ruled against the Warhol Foundation, holding that the commercial licensing of a work based on a copyrighted photograph was not protected by fair use. The reasoning was that the copied work served a highly similar purpose to the original, risking market substitution.

The outcome of the NYT lawsuit could profoundly alter not just how content is created and consumed but also the trajectory of AI development. If the court rules in favor of the NYT, it could order the destruction of models and datasets trained on the infringing articles, potentially requiring AI companies to rebuild their systems from scratch using entirely original or licensed works. The implications extend beyond financial penalties, reaching the core of how AI technologies operate and how they will evolve in the years to come.

As this legal saga unfolds, it has the potential to reshape the intersection of intellectual property, technology, journalism, and the law. The stakes are high, not only for the parties directly involved but for the broader landscape of content creation, consumption, and the future of artificial intelligence.

Philosophical Insights: Traversing the AI Landscape

In the intricate dance between artificial intelligence and journalism, the theories of Michel Foucault and Theodor Adorno illuminate the nuanced challenges we face. Foucault's lens exposes the power dynamics embedded in AI's integration into journalism. As algorithms curate and disseminate information, a subtle shift in control emerges. The potential impact on media diversity becomes a concern as AI, wielded as a powerful tool, shapes narratives and influences the information landscape. However, amid these considerations, it is crucial to acknowledge the looming prospect that AI may transcend its role as a tool and start to act autonomously. This raises additional questions about the extent to which unguided artificial intelligence could independently shape the evolving landscape of journalism.

Parallel to this, Adorno's post-WWII cultural critique resonates in the AI-driven media realm. The risk of cultural homogenization, which Adorno identified as a product of the power of big Hollywood studios and other corporate giants, looms again as algorithms prioritize popular content, potentially eroding the diverse fabric of perspectives cherished by independent journalism.

In contemplating the court's decision on the NYT lawsuit against OpenAI, we peer into a future where legal outcomes extend beyond the courtroom. This landmark case could become a guiding precedent, shaping the intricate interplay between media institutions, AI developers, and the realm of independent journalism. The ripple effect may transcend legalities, imprinting a direction for future engagements in the ever-evolving landscape of information dissemination.

As we stand at this intersection of technological advancement and ethical considerations, a nuanced perspective is essential. Progress in AI must harmonize with the ethical foundations of journalism, ensuring that innovation serves the principles of diversity, democracy, integrity, and the pursuit of truth in independent media.

AI in Grassroots Journalism: Navigating Challenges

In the realm of grassroots journalism, where news is produced by ordinary people often on the front lines of struggles for social change, the integration of AI unveils a double-edged sword, ushering in both promises and perils. As small, independent media outlets engage with the ever-evolving landscape of AI, the implications stretch beyond legal battles into the very heart of their mission.

The implications of this lawsuit extend far beyond major news outlets like the NYT; they also resonate deeply with small grassroots journalism publications such as Weave News, underscoring the shared challenges and importance of navigating the evolving landscape of AI in the pursuit of unbiased, diverse, and independent reporting.

For example, the financial strain posed by copyright lawsuits and legal skirmishes becomes a weighty burden for outlets with limited resources. For grassroots journalism, where every penny counts, the specter of such legal entanglements raises concerns about survival and sustainability.

The impact of AI on the uniqueness of independent media is also a critical consideration. As algorithms shape content creation, curation, and consumption, the risk of homogenization raises alarm bells for the diverse perspectives, critical thinking, and unique narratives that grassroots journalism cherishes. The richness and political power derived from diverse voices, particularly those representing marginalized communities speaking and acting in solidarity, may be at risk in an AI-driven environment.

The true strength of independent journalism lies in its ability to amplify grassroots perspectives, challenge the status quo, and give voice to the unheard. As AI becomes a fixture in this arena, preserving the essence of grassroots journalism demands a delicate balance. It necessitates not only legal resilience but also a steadfast commitment to safeguarding the diverse voices that form the backbone of independent media.

The AI Frontier: Charting New Paths for Journalism

Photo by Mojahid Mottakin on Unsplash. 

Herein lies the crux of the matter. The challenges raised by this lawsuit are not the NYT's alone; they fall just as heavily on small grassroots publications such as Weave News, which must navigate the same evolving AI landscape in pursuit of unbiased, diverse, and independent reporting.

The shared challenges are palpable — from the financial strain of legal battles to the subtle erosion of unique voices under the influence of AI-driven standardization. As we navigate these emerging realities, the lawsuit becomes not just a legal discourse but a collective call to action. It underscores the shared responsibility of all media entities, irrespective of size, to defend the integrity of journalism itself.

In the wake of this legal turbulence, the soul of journalism, especially in grassroots endeavors, beckons for resilience and adaptability. The evolving AI terrain is not just a battleground; it's a canvas where the strokes of innovation must be tempered with the brush of ethical considerations. As we contemplate the intersection of AI and journalism, let it be a collective endeavor to preserve the vibrant mosaic of voices, essential for a thriving, informed society. The journey ahead is one of shared challenges, but also shared victories in upholding the core tenets of media in the face of technological evolution.

Celine Schreiber

Celine Schreiber (she/they) is the Communications Director at Weave News. She is currently a Ph.D. Student at the University of Leipzig Department of Health Economics and Management. Celine holds a master’s degree in medicine, ethics, and law from the University of Halle, Germany, and earned her B.A. in Anthropology and Government from St. Lawrence University in 2020. In her current role at Weave News she oversees the Communications Team, focusing especially on building a meaningful online media experience for readers, donors, and affiliated organizations.
