[Vidhispeaks] Exploring community data rights over non-personal data

To reject the promise of the concept of Community Data Rights because of the way it has been used in the Report of the expert committee constituted by IT Ministry, would be to throw the baby out with the bathwater.

By Aniruddh Nigam

This year, a Committee of Experts constituted by the Ministry of Electronics & IT released a report (Report) proposing a framework for the regulation of non-personal data. The Report, despite some substantial gaps in its conceptualisation, asserts a clear policy intention – economic value generated from the processing and use of non-personal data should not be concentrated in private technology companies based in Silicon Valley, and citizens and communities in India should get a ‘fair share’ of this economic value and a ‘role to play’ in how this data is used.

In essence, this means the creation of data-sharing mandates where large technology companies would be required to share anonymised ‘high-value datasets’ with entities (data trusts) identified by the government, who would then make this data available for wider use by startups, domestic businesses and researchers.

The principled basis for this policy, as per the Report, is the realisation of the ‘community data rights’ of the people of India – a phrase used in the Report to refer generally to the interests that a community has in any data that is generated by them.

Despite the clear policy intention, the broad strokes used by the report have led to significant debates in Indian technology policy.

The Report poorly defines key phrases like ‘community data rights’, ‘data trusts’ and ‘high-value datasets’, leaving large parts of what this framework will actually look like up to imagination.

Commentators claim that many of these concepts need precise meaning and real-world examples before the implications of this policy for the digital economy can be fully understood – though there is perhaps an emerging consensus that the objectives of the Report are desirable. At the same time, some critics say that the Report proposes the ‘nationalisation’ of data – a policy which, according to some, is a poorly disguised data-grab to help large domestic players under the shroud of vaguely defined phrases like ‘data sovereignty’ and ‘community data rights’.

In this context, it is important to examine the key concepts forming the basis of this policy – interestingly, the phrase ‘community data rights’ which underpins the Report’s policy objectives is a somewhat novel phrase in academic and policy discourse.

In examining this phrase, I propose that the critics are partly right. The phrase ‘community data rights’ is essentially hollow – operating in the Report as a framing device for Indian developmental interests or, put more charitably, as a ‘legal innovation’ to articulate these interests.

It is also a useful framing device which should not be rejected simply because it lacks clearly identified theoretical antecedents – it is unlikely that concepts aiming to translate developmental interests of the ‘global South’ will find antecedents in an academic discourse largely dominated by Western academics.

Despite the hollowness imbued into the term by the Report, I suggest that the concept of ‘community data rights’ can be a potent framing device for a call for distributive justice in a digital economy which is overly skewed in favour of large technology companies.

Community data rights in the context of digital economy

To understand why this concept has value, it is important to understand the role of non-personal data in the digital economy. Conversations about data governance have often focused on the protection of privacy and autonomy in relation to personal data, leading to a well-defined understanding of the interests that an individual has in their personal data being reflected in data protection laws.

However, non-personal data (such as anonymised datasets, sales data, weather data, etc.) has not been subject to the same level of analysis. As a result, even though non-personal and anonymised datasets have grown in economic significance over the last few years, given their role in developing artificial intelligence and training machine learning models, conversations about the regulation of non-personal data remain relatively nascent.

Most legal systems and data markets evolved to view non-personal data through a propertarian lens, as a ‘resource’ owned by its collector. The traditional view of data has also been that it is a non-rivalrous resource, that is, one person using data does not prevent another from simultaneously using it. As a result, most jurisdictions did not strictly regulate the collection of non-personal data or view the economic benefits of non-personal data through a distributive lens.

This understanding is being challenged by three ideas. First, the conceptualisation of data as a ‘resource’ has been questioned by scholarship which finds that treating data as a resource undermines the role played by the people and communities who are involved in the generation of data and reduces their agency in relation to this data. For example, aggregate sales data held by an e-commerce platform is generated as a result of the economic activity between consumers and sellers, who are not recognised in relation to the ‘resource’.

Second, the concept of ‘ownership’ of data has been criticised for giving data collectors the exclusive privilege of determining the use of data to the detriment of other stakeholders, leading to calls for ‘polycentric’ governance of data.

Third, even though data is technically a non-rivalrous good which can be collected and used by many people, most data collection today is embedded in closed systems and proprietary platforms, in many cases based outside India. This leads to the concentration of competitive advantage based on this data in companies with established monopolies, which use this data for private gain and to entrench their position. There is no incentive for a company to share this data, and it would simply not be possible for a bootstrapped startup to collect data on the same scale as giants like Amazon. Consequently, even though it is the economic activities of the citizens and businesses in India that generate sales data for a platform like Amazon, they have no say in how this data is used. The data can neither be obtained nor used by domestic businesses, nor is it put to uses which are aligned with the developmental interests of Indian citizens and businesses.

The call for ‘community data rights’ over non-personal data, when viewed in this context, is simply a call for distributive justice in data governance. Nor is it an entirely novel claim for India to make – the same developmental interests were articulated in the 2019 Draft E-Commerce Policy using the language of “societal commons” and “national resources”, and in the 2019 Economic Survey Report, which claimed that anonymised datasets of personal data were “public goods” – concepts which were not appropriately fleshed out or analysed.

However, these claims for distributive justice, in many situations, do flow from well-established legal principles such as the public trust doctrine or the principles of self-determination – in which case, the construct of ‘community data rights’ can operate as a useful framing device for those principles. The key analytical task, which the Report failed to perform, is then the identification of these principles in a conceptually sound manner.

Conceptualising community data rights

This context-setting finally leads to the key question: what are ‘community data rights’? Interestingly, there are some important and unresolved debates for each of the three words which make up this phrase.

The first word perhaps leads to the most intractable debate – how should a community be defined? How would a community collectively hold rights, and how would interests of members within a community be prioritised? While there are no formulaic answers to these questions, the mere nebulousness of the concept of a ‘community’ should not be a nail in the coffin for a participatory vision of data governance.

It is almost dogmatic to state that communities are the result of complex social relationships, and identification within a community is not rigid but is a fluctuating outcome of economic, political and social positions. Despite this inherent complexity, it is not desirable to jettison the idea of community-centric governance altogether.

Elinor Ostrom and Sheila Foster’s work on establishing commons-based governance frameworks and ‘knowledge commons’ has demonstrated that it is possible to build community-level governance solutions which are equitable, just and efficient.

Unfortunately, the Report appears to have fallen for the easy fallacy of implicitly assuming that what vests in the community would ultimately vest in the State – since the community acts through the State. While there is an intuitive appeal to this reasoning, it is important to recognise that for participatory governance mechanisms to be meaningful, roles and interests must be delineated in a way that allows effective representation of interests – which necessarily requires more granular identification of communities.

There are also unresolved questions about precisely which data the community would have rights over, and what specifically the content of those rights would be. The Report, in its revised version, narrowed its scope from all non-personal data to ‘high-value datasets’.

A look at data-sharing mandates across the world, such as in Finland, shows that a lot more precision is necessary to be aligned with international practices.

The Report also fails to provide sufficient clarity on the content of the rights in question. Presumably, these rights extend to access to the data and control over the purposes for which data is used, though the specific mechanisms through which these controls will be deployed are unclear from the Report.

The Report’s use of ‘community data rights’ leaves a lot to be desired – yet the potency of this construct in anchoring demands for distributive justice should not be understated. The framing of ‘community data rights’, built on a conceptually sound foundation, would represent an evolution of Indian jurisprudence on data governance which is aligned to our unique developmental interests and position. To reject the promise of this concept because of the way it was used in the Report would be to throw the baby out with the bathwater.

Way forward

- The construct of ‘community data rights’ has the potential to be a framing device for distributive justice in the context of the digital economy. It is key that this construct be fleshed out in a conceptually sound manner, drawing from well-established legal principles such as the public trust doctrine and the principles of self-determination.

- Mere recognition of ‘community data rights’ would not by itself remedy the imperfections of the digital economy. This must be complemented with the creation of other institutional arrangements – such as the development of standards and protocols for data sharing and the creation of data trusts and exchanges.

- The Report of the Committee of Experts has sparked an important conversation for Indian technology policy, but concepts such as ‘community data rights’ and ‘data trusts’ require sustained engagement and refinement in order to herald meaningful changes in the digital economy.

Aniruddh Nigam is a Research Fellow at Vidhi Centre for Legal Policy.

Vidhispeaks is a fortnightly column on law and policy curated by Vidhi. The views expressed are of the fellow and do not reflect the views of Vidhi or Bar & Bench.
