Training AI Models On Copyrighted And Personal Data: Reconciling Fair Use And Privacy Rights
- IJLLR Journal
Bedanta De, KIIT School of Law, KIIT University
Page: 8869 - 8885
1. ABSTRACT
Rapid advances in generative AI now allow computers to produce text, images, music, and other content that resembles human-made work. Much of the data used to train these models is scraped from publicly available sources and includes large amounts of copyrighted material and personal data used without the permission of the authors or individuals concerned. This raises legal questions about whether such use is permitted under copyright law and privacy rules, particularly in India, where both are still evolving. Developers commonly defend AI training as fair use in the U.S. and as fair dealing in India, but these arguments are coming under closer scrutiny in light of the privacy standards established in Justice K.S. Puttaswamy v. Union of India, which recognized the right to privacy as a fundamental right under Article 21 of the Indian Constitution. In addition, India's recently enacted Digital Personal Data Protection Act, 2023 imposes stricter rules on the collection and processing of personal data and on the consent required for its use, which requires changes to how AI systems are built and trained. This paper examines the conflict between the use of copyrighted content in machine learning and the need to protect individual privacy, particularly where the training data contains sensitive information. Comparing the approaches of India, the United States, and Europe, the study reviews the existing regulations and identifies the changes that are needed. It argues for transparency about data sources, clear consent policies, and rules that balance innovation with the protection of rights in the age of AI.