OpenAI, which operates the artificial intelligence chatbot ChatGPT, told the Delhi High Court on Wednesday that it cannot be accused of copyright infringement in India since it stores the data and trains its Large Language Model (LLM) outside India [ANI Media Pvt Ltd v. Open AI Inc & Anr].
Senior Advocate Amit Sibal made the argument on behalf of OpenAI before the bench of Justice Amit Bansal during the hearing of ANI's copyright infringement suit against OpenAI.
The training of the LLM does not take place within India, and the storage of training data also does not take place within India, and thus the Indian Copyright Act does not extend to it, Sibal said.
He explained that the Indian Copyright Act extends only to the territory of India and does not have extra-territorial application. The senior counsel was addressing ANI's argument that OpenAI was storing the news agency's data to train ChatGPT.
"Any action that is claimed to be infringement must take place in the four corners of this country. If it doesn't take place in the four corners of this country, that particular action can't be infringement. My lords are entitled to decide whether it has taken place in this country or not. If my lords come to the conclusion that it hasn't taken place in the four corners of this country, then it's not infringement and that plea would have to be rejected," Sibal argued.
Sibal said he would separately deal with the fact that ChatGPT is accessible in India. This is only on the storage aspect, he clarified.
Sibal also submitted that there is no evidence that OpenAI stores any data in India and thus it must be concluded that the data is stored outside India. He further submitted that an LLM learns through training and that its memorization does not mean that data is stored or that the model has access to that data.
"There is no expressive use of their (ANI's) content, no evidence of repository of stored content," Sibal said, adding that the information used to train ChatGPT comes not from one individual work but from a vast body of data.
Sibal also said no foreign court has so far found infringement by ChatGPT. He went on to argue that even if the Indian Copyright Act is applied, infringement will not be found.
"The storage is only an intermediate step toward use of non-expressive elements of text corpus. Copyright Act only protects expression, it does not protect ideas or facts," he added.
ANI had earlier argued that ChatGPT has been infringing its copyright not only by taking content directly from its website but also by scraping content that is shared with the news agency's subscribers.
In its suit before the High Court, ANI has claimed that its original content is being exploited by OpenAI for commercial gain and to train its AI chatbot, ChatGPT, to answer user queries.
Today, Sibal said ANI cannot claim payment in the form of a license fee when ChatGPT's response to a user query comes from training of the LLM based on hundreds of millions of works.
"The plaintiff says that he is entitled to a license for that expression that is ultimately based on training from linguistic structures and grammar in the public domain and owned by all and derived from millions of persons other than the plaintiff," he said.
Sibal also said that verbatim reproduction of a work is almost an impossibility.
"There is no regurgitation that has been shown and no memorization that has been shown in this particular case and it won't be because today regurgitation is actually a failure of the system. It is extremely rare. It is what we don't want to happen, because the whole purpose of the large language model is to generate new responses of its own, and not to generate other people's responses, and that is why regurgitation is frowned upon," he told the Court.
He will continue his arguments on the next date of hearing.
On November 19, the Court had issued summons to OpenAI. Advocates Adarsh Ramanujan and Dr Arul George Scaria were also appointed as amici curiae in the case. They have made their arguments in the case.
The Court is considering the following four issues:
I. Whether the storage by Open AI of ANI’s data (which is in the nature of news and is claimed to be protected under the Copyright Act, 1957) for training its software (ChatGPT) would amount to infringement of ANI’s copyright.
II. Whether the use by Open AI of ANI’s copyrighted data in order to generate responses for its users, would amount to infringement of the news agency’s copyright.
III. Whether Open AI’s use of ANI’s copyrighted data qualifies as ‘fair use’ in terms of Section 52 of the Copyright Act, 1957.
IV. Whether the courts in India have jurisdiction to entertain the present lawsuit considering that the servers of the Open AI are located in the United States of America.