tartuNLP/Llammas-base

Llammas-base is a 7-billion-parameter language model developed by tartuNLP, based on the Llama-2 architecture. It underwent continued pre-training on an additional 5 billion tokens from the CulturaX dataset, with a document mix of 75% Estonian and 25% English. The model is designed for cross-lingual knowledge transfer: the Estonian-heavy training mix targets Estonian language processing, while the English portion helps retain English capabilities.

Warm
Public
Parameters: 7B
Precision: FP8
Context length: 4096
License: llama2
Hugging Face
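
The listing above can be turned into a short loading sketch. This is a minimal example, not an official recipe: it assumes the checkpoint is published on Hugging Face under the repository name in the title, that it loads with the standard `transformers` Auto classes (plausible for a Llama-2 based model), that `torch` is installed, and that the 4096 figure is the context window in tokens. The helper names `fits_context` and `complete` are hypothetical and chosen for illustration only.

```python
MODEL_ID = "tartuNLP/Llammas-base"  # repository name taken from the page title
CONTEXT_LENGTH = 4096               # value from the listing; assumed to be the context window in tokens


def fits_context(prompt_tokens: int, max_new_tokens: int) -> bool:
    """Return True if the prompt plus the generated tokens stay within the context window."""
    return prompt_tokens + max_new_tokens <= CONTEXT_LENGTH


def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Continue `prompt` with the base model (plain text completion, no chat template)."""
    # Imported lazily so fits_context() can be used without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_tokens = inputs["input_ids"].shape[1]
    if not fits_context(prompt_tokens, max_new_tokens):
        raise ValueError("prompt plus max_new_tokens exceeds the context window")

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("Eesti pealinn on")` would download the weights on first use and return the prompt followed by the model's continuation. Because this is a base model rather than an instruction-tuned one, prompts are best phrased as text to be continued, not as questions or commands.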