AnatoliiPotapov/T-lite-0.1

AnatoliiPotapov/T-lite-0.1 is an 8-billion-parameter continual-pretraining model for the Russian language, developed by AnatoliiPotapov. It uses a decoder-only transformer architecture with RMSNorm, SwiGLU, RoPE, and grouped-query attention, and was trained in bf16. The model excels at Russian text generation and carries domain-specific and cultural knowledge relevant to the Russian context, outperforming Llama-3-8B on the MERA Russian benchmark with a total score of 0.492. It is intended as a base for further fine-tuning into Russian-language applications.
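
Since the card positions the model as a base for fine-tuning rather than a chat model, the sketch below shows one way to load it for plain text completion with the Hugging Face transformers library. This is a minimal sketch assuming the standard transformers API; the prompt and generation settings are illustrative, not recommendations from the model author.

```python
# Minimal sketch: loading T-lite-0.1 for Russian text completion.
# The model ID comes from this card; everything else is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AnatoliiPotapov/T-lite-0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card states the model was trained in bf16
    device_map="auto",
)

# T-lite-0.1 is a base (continually pretrained) model, not an instruct model,
# so plain-text continuation is the natural usage pattern.
prompt = "Москва — столица"  # Russian: "Moscow is the capital"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Loading in bf16 matches the training precision; for instruction-following behavior, the model would first need supervised fine-tuning on Russian instruction data.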

Status: Warm
Visibility: Public
Parameters: 8B
Serving precision: FP8
Context length: 8192
Source: Hugging Face