The Computational Ingestion Pipeline And The Frontiers Of Copyright Law: A Doctrinal Dissection Of Machine Learning Training, Judicial Reckoning, And Legislative Reform In India
- IJLLR Journal
- Jun 12
Subhransu Sekhar Hota, Bennett University
ABSTRACT
The recent emergence of large-scale generative AI has produced an institutional crisis at the intersection of computation and intellectual property. This article examines whether the discrete technological acts involved in the modern machine learning training pipeline – data collection, transient storage, persistent storage, tokenization, preprocessing, and weight-embedding – constitute copyright infringement under the Copyright Act, 1957, whether individually or collectively. The analysis proceeds stage by stage, using a doctrinal mapping method based on the principles of ‘granularity’ and ‘mapping’ and testing each stage of the pipeline against the statutory exclusive rights of reproduction and adaptation. It then examines the applicability of the current statutory defences, such as the fair dealing provisions in Section 52(1)(a) of the Copyright Act, 1957; the non-applicability of the United States’ transformative-use doctrine in India’s closed statutory regime; and the implications of the statutory silence on text and data mining in India. Particular attention is given to the High Court of Delhi’s landmark generative-AI copyright case, whose proceedings are followed from the reservation of judgment on 27 March 2026 through the submissions of the court-appointed amici curiae and the industry’s response. Lastly, the article challenges the Working Paper on Generative AI and Copyright recently released by the Department for Promotion of Industry and Internal Trade and exposes the administrative and economic shortcomings of the so-called “One Nation, One License, One Payment” approach. It suggests a more balanced legislative alternative: a purpose-neutral text and data mining exception combined with machine-readable opt-outs and market-driven data cooperatives.
