Our May 2025 Spotlight comes from Hub member Devina Krishna. She is an assistant professor in the Department of English at Patna Women’s College, India. Devina earned her PhD in Linguistics from Jawaharlal Nehru University in New Delhi, specializing in Phonetics and Phonology. Her main research interests are phonetics, phonology, and language documentation.
We live in a world where language and technology go hand in hand, and there is little chance of separating them. A language was once described as a dialect with an army and a navy, and that sense of the definition has since shifted. It has been said that languages will cease to exist without technology, and technology now works hand in glove with language. The present world offers languages a wide range of opportunities to survive.
India is home to many languages, including scheduled, non-scheduled, and lesser-known languages. Many of them fall under the “endangered” category, at distinct levels of endangerment. On the one hand, languages are a means of preserving the history and heritage of a nation in an accessible form. On the other hand, if these languages face extinction, their loss diminishes that history and heritage for the world at large. Therefore, linguists, language scientists, and language enthusiasts address the issue of language loss and devise means to preserve vulnerable linguistic codes.
The present work is my original contribution on Artificial Intelligence-led translation tools and how these tools contribute to language preservation and potentially reduce language loss.
A survey conducted for the study showed that, among various translation applications, Google Translate is the dominant player in the translation tool market, though there remains room for other tools such as Bhashini and Anuvadini to carve out their niche. The “others” category also pointed to alternative tools that cater to specific needs or preferences. The study suggested that the vast majority of users were comfortable and content with translation tools, which speaks to their effectiveness and user-friendliness. It also indicated a strong consensus among users that translation applications offer more benefits than drawbacks. A majority of respondents further believed that translation applications can aid in the use and preservation of endangered languages.
One way to preserve these languages is through Artificial Intelligence (AI), which is being used to revive endangered Indian languages: machine learning algorithms can transcribe and translate oral histories and stories from speakers of endangered languages. AI-powered technology thus allows for the preservation of valuable cultural knowledge that might otherwise vanish. It does so through techniques such as speech recognition, text analysis, and image and audio processing. AI also powers interactive language learning platforms and provides speech recognition and pronunciation practice, thereby strengthening the foundation for the protection of Indian linguistic heritage. While humans are the biggest protectors of human languages, artificial intelligence can help renew and preserve linguistic codes.
Various platforms and initiatives use AI-powered translation tools for endangered Indian languages. A few of these include:
1. AI4Bharat’s Indian Language Translation Platform: This platform develops AI-powered translation tools for endangered Indian languages.
2. The Endangered Languages Project (ELP) India: This project also works to safeguard endangered codes through AI-powered translation tools.
3. Bhasha’s Indian Language Translation Platform: Another platform of this kind, it develops AI-led tools for Indian languages.
4. The People’s Linguistic Survey of India (PLSI): This platform aims to employ AI-led translation tools for documentation and research.
5. Google’s Endangered Languages Project: This application supports language preservation and documentation through AI-powered translation tools.
Santali is one of the vulnerable languages supported by AI-powered translation tools. Google Translate, one of the leading AI-led translation tools, supports Santali translation (text and speech). Another platform, AI4Bharat’s Indian Language Translation Platform, includes Santali support for machine translation. In addition, villagers in Karnataka are among thousands of speakers of diverse Indian languages producing speech data for the tech firm Karya, which builds datasets for firms such as Microsoft and Google to use in AI models for sectors such as education and health. The Government of India is also building digital services and, through Bhashini, linguistic datasets: the initiative aims to create open-source datasets in regional languages for building AI tools and applications. Moreover, the government’s initiative aims to deliver more services digitally, ensuring that citizens can access digital government services and information in their native language.
The significance of the present study lies in its exploration of how AI-powered translation tools can preserve vulnerable Indian linguistic codes through human-made tools and projects. By investigating the impact of AI-powered translation tools on the preservation and promotion of local languages, the study aims to highlight the potential of these technologies to bridge communication gaps and to foster inclusivity in a plurilingual society. Furthermore, the study aims to understand the role of AI-driven translation tools in diminishing the risk of language loss through project practice and extensive documentation. The study will also provide a rich source of data for future linguistic researchers, language enthusiasts, and language scientists. It offers the linguistic fraternity significant information confirming that technology within the Indian subcontinent helps maintain the linguistic ecosystem.