Soket AI Labs Partners with Google Cloud to Launch Open-Source Multilingual Model Pragna-1B

Soket AI Labs and Google Cloud launch Pragna-1B, an open-source multilingual model supporting Indic languages, enhancing AI's reach in India.

Soket AI Labs has unveiled Pragna-1B, an open-source multilingual language model designed to cater to the rich linguistic diversity of India. In collaboration with Google Cloud, this model aims to bridge the gap in AI language models for Indic languages, providing robust support for Hindi, Gujarati, Bangla, and English.

Overview of Pragna-1B

Pragna-1B is a decoder-only Transformer model with 1.25 billion parameters and a context length of 2048 tokens. Developed over six months, the model was trained on 150 billion tokens using 8000 GPU hours on NVIDIA A100 systems.
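
To put those figures in perspective, the short calculation below derives a few ratios from them. It is only a back-of-the-envelope sketch: it assumes the 150 billion tokens and the 8000 GPU hours describe the same training run, and the variable names are illustrative rather than anything published by Soket AI Labs.

```python
# Figures quoted in the article.
PARAMS = 1.25e9        # model parameters
TOKENS = 150e9         # training tokens
GPU_HOURS = 8000       # NVIDIA A100 GPU hours
CONTEXT_LENGTH = 2048  # tokens per full-length sequence

# Derived ratios (assumes tokens and GPU hours refer to the same run).
tokens_per_param = TOKENS / PARAMS
tokens_per_gpu_hour = TOKENS / GPU_HOURS
tokens_per_gpu_second = tokens_per_gpu_hour / 3600
full_context_sequences = TOKENS / CONTEXT_LENGTH

print(f"tokens per parameter:   {tokens_per_param:.0f}")
print(f"tokens per GPU hour:    {tokens_per_gpu_hour:,.0f}")
print(f"tokens per GPU second:  {tokens_per_gpu_second:,.0f}")
print(f"2048-token sequences:   {full_context_sequences:,.0f}")
```

Under that assumption, the model saw about 120 tokens per parameter, at an average of roughly 18.75 million tokens per GPU hour.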

Training and Development

The development of Pragna-1B involved several key steps:

  1. Embedding Alignment: Initially, only the embedding and lm_head were trained, with all other tensors kept frozen. A parallel sentences dataset from Bhasha-wiki, pairing sentences in six Indian languages with their English counterparts, facilitated this alignment.
  2. Continual Pretraining: All 1.25 billion parameters were then unfrozen for further training, focusing on Hindi, Bangla, and Gujarati due to computational constraints. The model processed approximately 150 billion tokens over 8000 GPU hours, maintaining high sampling probabilities for these languages.
  3. Instruction Fine-Tuning: The model underwent supervised fine-tuning across various tasks such as conversation, question-answering, summarization, and paraphrasing. This step incorporated over 13 million instances of instruction-response data from multiple sources including Bhasha-SFT, Indic-align, and Samvaad.
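
The first two steps above can be illustrated with a small, framework-free sketch. It is not Soket AI Labs' training code: the parameter names, the stage labels, and the sampling weights are all hypothetical, chosen only to show what "train the embedding and lm_head, freeze the rest" and "sample the target languages with high probability" might look like in practice.

```python
import random

# Hypothetical parameter names; a real checkpoint's names may differ.
PARAM_NAMES = [
    "embed_tokens.weight",
    "layers.0.attn.weight",
    "layers.0.mlp.weight",
    "final_norm.weight",
    "lm_head.weight",
]

def trainable_mask(param_names, stage):
    """Map each parameter name to True (trainable) or False (frozen)."""
    if stage == "embedding_alignment":
        # Step 1: only the embedding and the output head are updated.
        return {
            name: name.startswith(("embed_tokens", "lm_head"))
            for name in param_names
        }
    if stage == "continual_pretraining":
        # Step 2: every parameter is unfrozen.
        return {name: True for name in param_names}
    raise ValueError(f"unknown stage: {stage}")

# Hypothetical weights: the article only says the three Indic languages
# were sampled with high probability, not what the values were.
LANGUAGE_WEIGHTS = {"hindi": 0.3, "bangla": 0.3, "gujarati": 0.3, "english": 0.1}

def sample_languages(weights, n, seed=0):
    """Draw the language of each of n training documents by weight."""
    rng = random.Random(seed)
    languages = list(weights)
    return rng.choices(languages, weights=[weights[l] for l in languages], k=n)

if __name__ == "__main__":
    for stage in ("embedding_alignment", "continual_pretraining"):
        mask = trainable_mask(PARAM_NAMES, stage)
        print(stage, [name for name, on in mask.items() if on])
    draws = sample_languages(LANGUAGE_WEIGHTS, 10_000)
    print({lang: draws.count(lang) for lang in LANGUAGE_WEIGHTS})
```

In a real training framework the mask would typically be applied by switching gradient tracking on or off for each tensor, and the weights would drive a data loader rather than a list of labels.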

Ethics and Safety Alignment

Soket AI Labs places significant emphasis on ethical AI. The model’s fine-tuning includes specific datasets designed to prevent the generation of unethical or harmful content. This focus on safety and ethics is crucial for ensuring that Pragna-1B aligns with human values.

Community Engagement and Future Plans

Soket AI Labs plans to release Pragna-1B under an open-source license, inviting feedback from the community to refine and enhance the model further. An initial research preview of the instruction-tuned model is available via a chat interface, although it is not recommended for production use due to potential factual inaccuracies.

Significance and Potential

Pragna-1B represents a significant advancement in the field of AI language models for Indic languages. By focusing on linguistic inclusivity and ethical AI practices, Soket AI Labs aims to contribute to the broader AI community and enhance user engagement across diverse linguistic landscapes.

The collaboration with Google Cloud underscores the importance of leveraging advanced cloud infrastructure to develop and deploy AI models efficiently. As AI technology continues to evolve, models like Pragna-1B are poised to play a crucial role in making AI accessible and useful for a wider audience.

About the author

Shweta Bansal

With an MA in Mass Communication from Delhi University and seven years in tech journalism, Shweta focuses on AI and IoT. Her work, particularly on women's roles in tech, has garnered attention in both national and international tech forums. Her articles, featured in leading tech publications, blend complex tech trends with engaging narratives.