Creating a hub for info on the judicial system: The Agami Data for Justice Challenge

Aditya AK

Over the years, different organisations have undertaken studies to map out data related to the Indian legal system. While such studies have gone a long way in helping us understand the issues that plague the system, no attempt has been made to collate and streamline the data obtained through them. And now, the Agami Data for Justice Challenge attempts to do just that.

Launched on July 15, the Data for Justice Challenge invites individuals and organisations to create a data hub.

Agami co-founder Supriya Sankaran delves into the details.

What is the idea behind the Data for Justice Challenge?

We heard several individuals and organizations (researchers, think tanks, journalists, startups) emphatically describe both the opportunities and the difficulties in accessing legal datasets. Datasets on cases and other matters in our systems of law and justice, drawn from public and private sources such as courts, tribunals, the National Crime Records Bureau (NCRB), and surveys, shed light on how our courts function and on the impact of laws and judgments, and offer insights into crime, litigants, and cases. They can help us hold our legal institutions accountable.

However, accessing such data today is both time- and cost-intensive, when in fact it should be a public good: a community resource accessible to all.

We saw the opportunity in the fact that many researchers, organizations, universities, journalists, etc. have invested in creating such valuable datasets. We saw that datasets that are so painfully created, and that often draw on public sources aided by philanthropic/public capital, are used only once: by the teams creating them to publish their one story, report or paper.

So the Data for Justice Challenge was framed to leverage all the existing efforts to create the greatest multiplier effect, in the leanest possible way.

Following deep engagements with over 130 actors invested in catalysing a data-driven future in law and justice, we framed the Challenge goal to enable each of us to share, build upon and co-create datasets in law and justice.

We believe if we break out of our silos and collaborate on each other’s datasets, we could create a resource that can grow the community of data-users and generate insights for impact. Drawing from the open data movement in other sectors including science, we are convinced that openly accessible datasets can lead to many different users, including students, using data for a great variety of purposes. Wider access to datasets will increase the return on investment in each initiative and the field as a whole.

What are the qualities you are looking for in the creator of the hub/data sets?

We are looking for entrepreneurial and creative individuals committed to the vision of building an open legal data movement in India. They need to be able to weave in the technology with the processes, tools, standards, and incentives to enable existing and future authors of datasets to trust the hub.

What would be some of the challenges surrounding the collation and arrangement of data?

There would be many aspects that need to be carefully designed at the level of both technology and community curation. Given that data is classified differently across sources, technology will have to be designed to enable the cataloging, interoperability, and analysis of large datasets. This will have to be woven in with the creation of incentives for authors to share datasets. The hub will have to evolve standards for data documentation and sharing to inspire trust and credibility.

Is the idea limited to collating existing sets of data, or is there an endeavour to go about collecting new data as well?

It would surely include collecting new data continuously over the next several years. Track 2 of the Challenge invites applicants to create new datasets/data-driven projects and commit to sharing them on the hub. The vision is that authors of future datasets will seek to host their datasets on the hub.

Since a lot of the existing data on pendency, delays, etc. comes from the government, is there a plan to work with them as well?

We sure hope so! In the first instance, the Challenge calls on each of us to act to share and co-create open legal datasets. But we recognize that the government and the Judiciary can play the biggest role here.

Through the Open Government Data Platform and opening of 10,808 APIs, the Indian government has opened its data for many applications. However, only two belong to the judicial sector. Opening more APIs in relation to judicial data sets would therefore further the commitment to strengthen citizen trust in the system and enable citizen participation and engagement in decision-making.

There is a great opportunity to be unlocked by offering API access to e-court data. And there is growing momentum around this issue now. This needs deep engagement with a range of actors such as researchers, academicians, data and privacy experts, and technology experts. It is now the right time to host discussions around the process of providing API access while safeguarding the public interest. We have to work collectively with the Judiciary and the government to facilitate open access.

How can such a hub potentially address the issues plaguing the justice delivery system?

In 2014, Rukmini Srinivasan reported that 30% of all sexual assault cases filed before Delhi District Courts in 2013 dealt with consenting couples whose parents had accused the boy of rape, and another 20% dealt with “breach of promise to marry”.

In 2017, when Stayzilla’s founder was arrested on criminal charges, Alok Prasanna shed light on the increasing use of criminal cases to try and resolve civil disputes: the number of cheating cases doubled between 2006 and 2015. In 2018, Susan Thomas & Ajay Shah sifted through data and showed that the probability of a case closing in the 180 days prescribed by the Insolvency and Bankruptcy Code is 0.9!

In the same year, Apar Gupta and Abhinav Sekhri brought to our attention that despite Section 66A of the Information Technology Act, 2000 being struck down by the Supreme Court, it continued to be used across the country to arrest citizens.

If the hub succeeds, there will be many more such insights that help us better understand our legal system and identify opportunities for action. More importantly, such datasets should feed into the creation of systemic feedback loops and accountability. Further, it will help us take a data-driven approach to judicial and legal reform.

Bar & Bench also spoke to Director and Co-founder of CivicDataLab, Gaurav Godhwani, to garner views from a potential user of the hub.

What do you see as the value of a hub like this? Why is it needed?

The state of judicial data in India has been quite complex; legal researchers painstakingly work through numerous access restrictions to gather data scattered among different court websites, and then work hard to standardize it so that it is usable for their analysis.

There is still no standard way to acquire, clean, and share their data, so all this hard work is lost in silos. This becomes a major roadblock to analysing key issues like case pendency, implementation of laws, court workload, and more, especially when we move further to local courts.

The Agami Data for Justice Challenge will be a first-of-its-kind initiative to enable co-creation and sharing of judicial data in India. It will bring together a diverse set of people working in the judicial space, including legal researchers, media persons, academicians, technologists, policy makers, and governments, to publish invaluable data.

It will build a community that can collaborate and develop data-driven insights about how our courts are performing on various topics of national and local importance. The potential of creating a platform like the Data for Justice hub is huge and will positively impact judicial data exchange in the country. 

What will be the most critical things to get right?

There are a variety of challenges in working in this space, but personally, I find these the most pressing ones at the moment:

Data Privacy & Ethics: Various court websites publish a lot of Personally Identifiable Information (PII) through their orders and judgments. This PII generally includes the name and age of the parties and witnesses involved. Sometimes, it also covers their caste, mother tongue, residential address, and more. Once such data is collected and published online in bulk, it can pose a potential threat to the right to privacy of the people involved. It also raises several ethical challenges around processing data on ongoing cases, as this may have a direct effect on court proceedings.

Thus, it would be quite critical to build a comprehensive and evolving data ethics, privacy and security framework for co-creating and publishing these high-value datasets.

Enabling Co-creation: The major benefit of a platform like this would be an opportunity to collaborate with other individuals and organizations to co-create data. Researchers from small and mid-level organizations, or those working independently, would now have an opportunity to collaborate with a wide community of technologists and data scientists to automate some of the mundane tasks of data plumbing.

Similarly, legal researchers can guide others about the judicial intricacies of this data and ensure effective data-based storytelling. Thus, it would be quite critical for the hub to brainstorm ideas and come up with a concrete plan to encourage users to collaborate and contribute data.

Judicial Data Wiki/Guidebook: The current judicial space has a plethora of legal jargon, legislative changelogs, various amendments, regional laws and more. While we are working to build this community hub of judicial data, it would be essential to leverage knowledge around judicial processes and integrate a comprehensive Judicial Data Wiki or Guidebook, including glossaries, data dictionaries, changelogs and more, to help the community better adopt this data.

What would you as a potential user look for?

As a potential user, I am really excited to understand and participate in the various use-cases and the underlying data our peers in the Judiciary space are working on at the moment. CivicDataLab has been working for the last couple of months on generating key judicial datasets related to child rights protection implementation, digital rights, city-level case mappings and more. We are excited to share these with the wider community and get their feedback. We eagerly look forward to co-creating, sharing and analysing more granular and timely data on our judicial ecosystem through Agami’s Data for Justice Challenge.

Agami had previously launched the E-ADR Challenge to set up an institution that will offer e-arbitration services.

Applications for the Agami Data for Justice Challenge close on August 10. The winning teams will be announced on September 29.

Bar and Bench - Indian Legal news