

A committee constituted by the Department for Promotion of Industry and Internal Trade (DPIIT) has released a working paper recommending that AI developers fairly compensate copyright holders for the use of their content to train Large Language Models (LLMs).
DPIIT, which comes under the Ministry of Commerce & Industry, had formed the committee to examine the intersection of AI and copyright.
The panel, headed by DPIIT Additional Secretary Himani Pande and comprising various legal and tech experts, was tasked with examining whether the existing legal framework on copyright adequately addresses the issues raised by the new technology or whether amendments are required.
A working paper outlining the committee's analysis and recommendations has now been released on the DPIIT website for consultation with the public and stakeholders.
Comments or feedback on the paper can be sent to DPIIT at the email address "ipr7-dipp@gov.in" within 30 days from December 8.
In their consultations with the committee, a majority of stakeholders from the AI industry had advocated for a blanket Text and Data Mining (TDM) exception that would enable training of AI on copyright-protected works. However, the representatives of the content industry unanimously advocated for a voluntary licensing model.
In the report released on December 8, the panel said that the TDM exception model recommended by the tech industry would not be a prudent policy approach as it would undermine copyright and leave creators powerless to seek compensation for use of their works in AI training.
"It was not found to be a wise policy choice, especially for a country like India which has a rich cultural heritage and a growing content industry with immense potential," the paper said.
Even the option of an opt-out for copyright holders was found to be insufficient to achieve the necessary balance. The committee said that such an approach would leave small creators largely unprotected owing to a lack of awareness, of bargaining power to negotiate, and of mechanisms to check whether their content has been scraped despite an opt-out.
Importantly, the committee opined that this model may also limit the availability of broad and representative datasets for AI training, especially if many rights holders choose to opt out.
Thus, under the framework suggested by the panel, the rights holders will not have the option to withhold their works from use in the training of AI systems. The committee has recognized that access to large volumes of data and high-quality data is crucial for AI development.
"With a majority view, the Committee decided to recommend a mandatory blanket license in favour of AI Developers for the use of all lawfully accessed copyright-protected works in the training of AI Systems, accompanied by a statutory remuneration right for the copyright holders," the paper states.
To achieve this, the panel has recommended a hybrid model to balance the rights of creators and industry demand.
The key features of the hybrid model are:
1) Availability of all lawfully accessed copyrighted content for AI Training as a matter of right, without the need for individual negotiations;
2) Reduced transaction costs for AI Developers;
3) Reduced compliance burden on AI Developers;
4) Fair compensation to copyright holders;
5) Judicial review over royalty rates established;
6) Easy and straightforward process of payment to rightsholders;
7) Mitigated risk of AI bias and hallucinations; and
8) Level playing field for all, including start-ups and small players.
The panel suggested that a centralized non-profit entity of associations of rights holders can be designated by the Central government under the Copyright Act. It would be responsible for collecting the payments from the AI Developers and then distributing these among its members.
Only one member per class of work would be allowed: either the copyright society for that category registered under Section 33 of the Copyright Act, 1957, or a not-for-profit Collective Management Organization formed by rights holders of a relevant class of work with broad representation, the panel has recommended.
"Certain percentage of the revenue generated from AI Systems trained on copyrighted content would be payable as royalties. The royalty rates would be fixed by a committee appointed by the government. By preserving the right of the copyright owners to receive royalties and administering it through a single umbrella organization made by the rights holders and designated by the movement, the model aims to provide an easy access to content for AI Developers for AI Training, simplify licensing procedures, reduce transaction costs, ensure fair compensation for rightsholders," the paper states.
The entity can be called the 'Copyright Royalties Collective for AI Training (CRCAT)', the panel has suggested.
The distribution of royalty would be based on a Works Database where creators register their material, as per the panel.
AI developers would also be required to submit a “sufficiently detailed” disclosure of training datasets, covering the categories, nature and broad sources of content. In effect, they must give a summary of the types of content used for training, but they do not need to reveal technical details or confidential data.
Pertinently, the National Association of Software and Service Companies (Nasscom) expressed its dissent regarding the hybrid approach. Rights holders should be provided clear statutory protection against Text and Data Mining (TDM), Nasscom said.
Besides IAS officer Himani Pande, Simrat Kaur, Anurag Kumar, Advocates Ameet Datta and Adarsh Ramanujan, Raman Mittal, Chockalingam M and Sudipto Banerjee were also part of the panel.
Notably, the question of whether AI developers like OpenAI, which operates ChatGPT, can access copyrighted content without any exception and payment is currently pending before the Delhi High Court in the case of ANI Media Pvt. Ltd. v. Open AI Inc.
ANI has sued OpenAI for copyright infringement, alleging that ChatGPT was trained using ANI’s content without its permission. It is the first case of this nature in India.