
Conversation article: How a New York Times copyright lawsuit against OpenAI could potentially transform how AI and copyright work

Professor Dinusha Mendis writes for The Conversation about the potential copyright implications of AI as a lawsuit is lodged by the New York Times against the creator of ChatGPT…

How a New York Times copyright lawsuit against OpenAI could potentially transform how AI and copyright work

Image: Stas Malyarevsky / Shutterstock

Dinusha Mendis, Bournemouth University

On December 27, 2023, the New York Times (NYT) filed a lawsuit in the Federal
District Court in Manhattan against Microsoft and OpenAI, the creator of ChatGPT,
alleging that OpenAI had unlawfully used its articles to create artificial intelligence (AI) products.

Citing copyright infringement and the importance of independent journalism to democracy, the newspaper further alleged that even though the defendants may have “engaged in wide scale copying from many sources, they gave Times content particular emphasis” in training generative artificial intelligence (GenAI) tools such as Generative Pre-trained Transformers (GPT). This is the kind of technology that underlies products such as the AI chatbot ChatGPT.

The complaint by the New York Times states that OpenAI took millions of copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides and more in an attempt to “free ride on the Times’s massive investment in its journalism”.

In a blog post published by OpenAI on January 8, 2024, the tech company responded to the allegations by emphasising its support of journalism and partnerships with news organisations. It went on to say that the “NYT lawsuit is without merit”.

In the months prior to the complaint being lodged by the New York Times, OpenAI had entered into agreements with large media companies such as Axel Springer and the Associated Press, although notably, the Times failed to reach an agreement with the tech company.

The NYT case is important because it is different to other cases involving AI and copyright, such as the case brought by the online photo library Getty Images against the tech company Stability AI earlier in 2023. In that case, Getty Images alleged that Stability AI processed millions of copyrighted images using a tool called Stable Diffusion, which generates images from text prompts using AI.

The main difference between this case and the New York Times one is that the newspaper’s complaint highlighted actual outputs produced by OpenAI’s AI tools. The Times provided examples of its articles being reproduced almost verbatim.

Use of material

The defence available to OpenAI is “fair use” under section 107 of the US Copyright Act 1976. OpenAI can argue that the unlicensed use of copyright material to train generative AI models amounts to a “transformative use” which changes the original material. However, the complaint from the New York Times also says that OpenAI’s chatbots bypassed the newspaper’s paywalls to create summaries of articles.

Even if such summaries do not themselves infringe copyright, the New York Times could point to them to try to demonstrate a negative commercial impact on the newspaper – challenging the fair use defence.

Image: ChatGPT (Giulio Benzin / Shutterstock)

This case could ultimately be settled out of court. It is also possible that the Times’ lawsuit was more a negotiating tactic than a real attempt to go all the way to trial. Whichever way the case proceeds, it could have important implications for both traditional media and AI development.

It also raises the question of the suitability of current copyright laws to deal with AI. In a submission to the House of Lords communications and digital select committee on December 5, 2023, OpenAI claimed that “it would be impossible to train today’s leading AI models without copyrighted materials”.

It went on to say that “limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment but would not provide AI systems that meet the needs of today’s citizens”.

Looking for answers

The EU’s AI Act – the world’s first AI Act – might give us insights into some future directions. Among its many articles, there are two provisions particularly relevant to copyright.

The first provision, titled “Obligations for providers of general-purpose AI models”, includes two distinct requirements related to copyright. Section 1(c) requires providers of general-purpose AI models to put in place a policy to respect EU copyright law.

Section 1(d) requires providers of general-purpose AI models to draw up and make publicly available a detailed summary of the content used to train their AI systems.

While section 1(d) raises some questions, section 1(c) makes it clear that any use of copyright-protected content requires the authorisation of the rights holder concerned unless relevant copyright exceptions apply. Where the right to opt out has been expressly reserved in an appropriate manner, providers of general-purpose AI models, such as OpenAI, will need to obtain authorisation from rights holders if they want to carry out text and data mining on their copyrighted works.

Even though the EU AI Act may not be directly relevant to the New York Times complaint against OpenAI, it illustrates the way in which copyright laws will be designed to deal with this fast-moving technology. In future, we are likely to see more media organisations relying on such laws to protect journalism and creativity. In fact, even before the EU AI Act was passed, the New York Times blocked OpenAI from trawling its content. The Guardian followed suit in September 2023 – as did many others.
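In practice, this kind of blocking is usually done through a site’s robots.txt file. As a minimal sketch (the exact configuration any one publisher uses is an assumption here), the following two lines ask OpenAI’s GPTBot web crawler to stay away from the whole site:

    # robots.txt – ask OpenAI's crawler not to collect anything from this site
    User-agent: GPTBot
    Disallow: /

Compliant crawlers read this file and skip the site, although robots.txt is a convention rather than a technical barrier.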

However, blocking the crawler did not allow material to be removed from existing training data sets. Any copyrighted material already used to train the models could therefore still surface in OpenAI’s outputs – which led to negotiations between the New York Times and OpenAI breaking down.

With laws such as the EU AI Act now placing legal obligations on providers of general-purpose AI models, their future could look more constrained in the way they use copyrighted works to train and improve their systems. We can expect other jurisdictions to update their copyright laws with similar provisions to those of the EU AI Act in an attempt to protect creativity. As for traditional media, ever since the rise of the internet and social media, news outlets have been challenged in drawing readers to their sites, and generative AI has simply exacerbated this issue.

This case will not spell the end of generative AI or copyright. However, it certainly raises questions for the future of AI innovation and the protection of creative content. AI will continue to grow and develop, and we will continue to see and experience its many benefits. However, the time has come for policymakers to take serious note of these AI developments and update copyright laws, protecting creators in the process.

Dinusha Mendis, Professor of Intellectual Property and Innovation Law; Director, Centre for Intellectual Property Policy and Management (CIPPM), Bournemouth University

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Conversation article: ChatGPT isn’t the death of homework – just an opportunity for schools to do things differently

Professor Andy Phippen writes for The Conversation about how education can adapt to AI technology…

ChatGPT isn’t the death of homework – just an opportunity for schools to do things differently

Image: Daisy Daisy / Shutterstock

Andy Phippen, Bournemouth University

ChatGPT, the artificial intelligence (AI) platform launched by the research company OpenAI, can write an essay in response to a short prompt. It can perform mathematical equations – and show its working.

ChatGPT is a generative AI system: an algorithm that can generate new content from existing bodies of documents, images or audio when prompted with a description or question. It’s unsurprising that concerns have emerged that young people are using ChatGPT and similar technology as a shortcut when doing their homework.
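To make “prompted with a description or question” concrete, the sketch below uses OpenAI’s official Python client library; the model name and the prompt are illustrative placeholders, and it assumes the openai package is installed with an API key available in the OPENAI_API_KEY environment variable:

    # Minimal sketch: asking a generative AI model to write an essay.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any chat-capable model would do
        messages=[
            {
                "role": "user",
                "content": "Write a 300-word essay on the causes of the first world war.",
            }
        ],
    )

    print(response.choices[0].message.content)  # the generated essay

A dozen lines and one short prompt are enough to produce an essay-length answer – which is precisely why homework policies are being looked at again.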

But banning students from using ChatGPT, or expecting teachers to scour homework for its use, would be shortsighted. Education has adapted to – and embraced – online technology for decades. The approach to generative AI should be no different.

The UK government has launched a consultation on the use of generative AI in education, following the publication of initial guidance on how schools might make best use of this technology.

In general, the advice is progressive and acknowledges the potential benefits of using these tools. It suggests that AI tools may have value in reducing teacher workload when producing teaching resources, marking, and in administrative tasks. But the guidance also states:

Schools and colleges may wish to review homework policies, to consider the approach to homework and other forms of unsupervised study as necessary to account for the availability of generative AI.

While little practical advice is offered on how to do this, the suggestion is that schools and colleges should consider the potential for cheating when students are using these tools.

Nothing new

Past research on student cheating suggested that students’ techniques were sophisticated and that they felt remorseful only if caught. They cheated because it was easy, especially with new online technologies.

But this research wasn’t investigating students’ use of ChatGPT or any kind of generative AI. It was conducted over 20 years ago, part of a body of literature that emerged at the turn of the century around the potential harm newly emerging internet search engines could do to student writing, homework and assessment.

We can look at past research to track the entry of new technologies into the classroom – and to infer the varying concerns about their use. In the 1990s, research explored the impact word processors might have on child literacy. It found that students writing on computers were more collaborative and focused on the task. In the 1970s, there were questions on the effect electronic calculators might have on children’s maths abilities.

In 2023, it would seem ludicrous to state that a child could not use a calculator, word processor or search engine in a homework task or piece of coursework. But the suspicion of new technology remains. It clouds the reality that emerging digital tools can be effective in supporting learning and developing crucial critical thinking and life skills.

Get on board

Punitive approaches and threats of detection make the use of such tools covert. A far more progressive position would be for teachers to embrace these technologies, learn how they work, and make this part of teaching on digital literacy, misinformation and critical thinking. This, in my experience, is what young people want from education on digital technology.

Image: children in class looking at tablets. Young people should learn how to use these online tools. (Ground Picture/Shutterstock)

Children should learn the difference between acknowledging the use of these tools and claiming the work as their own. They should also learn whether or not to trust the information provided to them on the internet.

The educational charity SWGfL, of which I am a trustee, has recently launched an AI hub which provides further guidance on how to use these new tools in school settings. The charity also runs Project Evolve, a toolkit containing a large number of teaching resources around managing online information, which will help in these classroom discussions.

I expect to see generative AI tools being merged, eventually, into mainstream learning. Saying “do not use search engines” for an assignment is now ridiculous. The same might be said in the future about prohibitions on using generative AI.

Perhaps the homework that teachers set will be different. But as with search engines, word processors and calculators, schools are not going to be able to ignore the rapid advance of generative AI. It is far better to embrace and adapt to change than to resist (and fail to stop) it.

Andy Phippen, Professor of IT Ethics and Digital Rights, Bournemouth University

This article is republished from The Conversation under a Creative Commons license. Read the original article.