

Most litigation teams recognise the "search tax": the hidden time, cost, and cognitive burden of locating specific evidence (a scanned invoice, a site report, or a bank statement) buried within hundreds of pages. The cost of search is not merely lawyers’ hours; it is the latent risk of missing the single document that alters the trajectory of a case. Grossman and Cormack’s influential study in the Richmond Journal of Law and Technology observed that technology-assisted review can be more effective than exhaustive manual review. In Indian litigation, however, this burden is compounded by records that are non-digitised, multilingual, and inconsistently indexed.
The scale of this inefficiency is systemic. India faces over 5.46 crore pending cases. The problem is not merely the volume of documentation, but the fragmentation of the document layer itself. A typical case file contains a chaotic mix: English pleadings, Hindi evidence, Marathi revenue records, handwritten notings, old stamps, scanned bank statements, and annexures pulled from multiple proceedings.
These realities point to a fundamental shift in legal tech priorities. While document drafting is a vital output, the primary operational obstacle remains the integrity and accessibility of the underlying record. Before a lawyer can draft a pleading, advise on risk, or cross-examine a witness, they must first reconstruct fact patterns from messy, multi-source records. In this environment, verification is the prerequisite for effective advocacy.
This is the evidence-reconstruction challenge that legal document intelligence must address, and it is the problem Bharat.Law’s Document Intelligence framework is built around.
In Indian litigation, the transition from document to evidence is inherently non-linear, transforming what should be routine retrieval into a complex exercise of forensic reconstruction. A truly functional system must look beyond standard search capabilities to accommodate the practical reality of Indian legal records—a landscape characterised by abrupt multilingual shifts, degraded scan quality, and significant issue drift across disparate proceedings.
The "search tax" accumulates most heavily at these friction points. It is felt acutely when practitioners must decipher engineering abbreviations and handwritten annotations in site reports, or when appellate records force a reconciliation between born-digital pleadings and fragmented, poor-quality legacy annexures. In such an environment, the failure mode for legal technology is not an inability to scan a document, but an inability to render that document usable for active litigation.
Consequently, the mandate for an effective platform is not speed alone, but verifiability. Any system that claims to resolve the search tax must enable practitioners to move seamlessly from a high-level insight to the precise exhibit page. This traceability ensures that every summary or claim is firmly anchored in the source material, providing the rigorous audit trail that courts and counsel demand.
The burden is most visible in commercial arbitration, where the dispute is rarely contained within the contract alone. Claims regarding delay, variation, payment, and termination are usually scattered across measurement books, site instructions, drawings, approvals, progress reports, invoices, correspondence, and expert reports. This high-stakes arbitration environment underscores why efficient search is not merely an administrative convenience; it is central to claim strategy, risk assessment, and effective settlement.
In insolvency, the problem is equally document-heavy. A single CIRP may require lawyers and resolution professionals to connect loan documents, security papers, account statements, claims, CoC minutes, valuation reports, forensic findings, and NCLT filings. With IBBI reporting over 8,000 admitted CIRPs and extended resolution timelines, the cost of finding and verifying the right document directly impacts recovery.
Tax, regulatory, and white-collar matters introduce a different layer: the evidence is not only textual, but transactional. GST, customs, and direct tax disputes often require reconciliation across notices, replies, returns, ledgers, invoices, and appellate records. Similarly, SEBI, competition, and anti-corruption matters require pattern reconstruction across emails, board papers, bank statements, and audit trails. Here, the missed fact is often not a paragraph, but a specific transaction or timeline hidden across documents.
Land and property disputes are perhaps the most distinctively Indian example. Title and encumbrance questions may depend on sale deeds, mutation entries, revenue records, survey maps, tax receipts, and local-language government documents accumulated over decades. Across all these verticals, the common problem is clear: Indian legal teams are not merely searching for documents; they are reconstructing evidence from fragmented, historical records.
Traditional search was designed for finding files, not proving facts. It succeeds only when the user knows the exact keyword, the text has been captured perfectly, and the document uses the same language as the query. Legal records rarely behave this way. A delay claim may not use the word “delay”; a default may appear as a missed instalment, a recall notice, or an admission in correspondence.
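The gap between a keyword and a legal concept can be shown with a small sketch. Everything in it is hypothetical: the page records, the `CONCEPT_PHRASES` list, and both functions are illustrative assumptions, not a description of how any particular product works. A hand-written phrase list is also only a stand-in for the far richer matching a real system would need.

```python
import re

# Hypothetical page records: (document, page number, extracted text).
PAGES = [
    ("Loan statement", 12, "The instalment due on 1 March was missed by the borrower."),
    ("Bank letter", 3, "We hereby issue a recall notice for the entire facility."),
    ("Site report", 7, "Concrete pour completed on schedule."),
]

# Illustrative, hand-written phrases that may signal one legal concept.
CONCEPT_PHRASES = {
    "default": ["default", "missed", "recall notice", "overdue"],
}


def _contains(text, phrase):
    """Case-insensitive whole-word (or whole-phrase) match."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE) is not None


def keyword_hits(pages, keyword):
    """Traditional search: return pages that contain the exact keyword."""
    return [(doc, page) for doc, page, text in pages if _contains(text, keyword)]


def concept_hits(pages, concept):
    """Return pages that contain any phrase associated with the concept."""
    phrases = CONCEPT_PHRASES.get(concept, [concept])
    return [
        (doc, page)
        for doc, page, text in pages
        if any(_contains(text, phrase) for phrase in phrases)
    ]
```

On these sample pages, `keyword_hits(PAGES, "default")` finds nothing, because no page uses the word, while `concept_hits(PAGES, "default")` surfaces the missed instalment and the recall notice.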
While OCR search improves access, it does not solve the legal problem. It converts scanned pages into searchable text, but it does not understand whether a date is material, whether an amount supports a claim, or whether a handwritten endorsement changes the meaning of a record. In Indian files, OCR is merely the starting point.
Generic AI introduces a different risk. It can summarise and answer fluently, but fluency is not synonymous with reliability. In legal work, an answer is only useful if it can be traced back to the source: the document, page, paragraph, and context. A summary without provenance merely shifts the burden back to the lawyer, who must still verify whether the answer is accurate and grounded in the record.
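One way to picture this requirement is a data structure in which every statement carries its own source references. The sketch below is a minimal illustration under stated assumptions: the names `SourceAnchor`, `Finding`, `split_by_provenance`, and `cite` are hypothetical and are not drawn from any existing tool.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class SourceAnchor:
    """Hypothetical pointer to the place in the record that supports a statement."""
    document: str
    page: int
    paragraph: int


@dataclass
class Finding:
    """A statement about the case plus the anchors that support it."""
    statement: str
    anchors: List[SourceAnchor] = field(default_factory=list)


def split_by_provenance(findings: List[Finding]) -> Tuple[List[Finding], List[Finding]]:
    """Separate findings that can be traced to the record from those that cannot."""
    anchored = [f for f in findings if f.anchors]
    unanchored = [f for f in findings if not f.anchors]
    return anchored, unanchored


def cite(finding: Finding) -> str:
    """Render the anchors of a finding as a human-readable citation string."""
    return "; ".join(
        f"{a.document}, p. {a.page}, para {a.paragraph}" for a in finding.anchors
    )
```

In such a design, a finding with an empty `anchors` list is flagged for manual verification rather than presented as established fact.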
The deeper limitation is that most tools do not operate at the level of legal workflow. Lawyers need chronologies, issue-wise evidence maps, contradiction checks, financial figures, and entity relationships—not just search results. They need to move from a fragmented record to a coherent theory of the case.
The search tax will not be solved by adding another search bar to legal files. Indian legal teams need a different workflow: one that can read the entire record, preserve context across pleadings and annexures, and convert scattered documents into usable legal understanding.
A litigation file should not be treated as a folder of PDFs, but as a connected case record. This is the direction in which legal AI must evolve. Lawyers should be able to build chronologies, identify key facts, and move from every insight back to the exact source page. In legal work, traceability is not a feature; it is the condition for trust.
Bharat.Law’s Document Intelligence framework is built around that reality. It is designed for litigation-scale case files running into tens of thousands of pages, without requiring lawyers to manually prepare the record. It addresses the realities of the Indian record: multilingual documents, mixed digital and scanned files, and inconsistent formats.
The next benchmark for legal AI in India should be practical: can it help a lawyer find the right evidence faster, understand why it matters, and verify it at the page level? If it can, it does more than answer questions. It reduces the search tax that sits at the heart of document-heavy legal work.
About the author: Nimit Kumar is the Founder of Bharat.Law, a full-stack AI Litigation and Legal Assistant purpose-built for India.
Disclaimer: The opinions expressed in this article are those of the author(s). The opinions presented do not necessarily reflect the views of Bar & Bench.