haoranxu/ALMA-13B-R

ALMA-13B-R is a 13-billion-parameter language model from Haoran Xu and collaborators, fine-tuned specifically for machine translation. It applies Contrastive Preference Optimization (CPO) on top of the ALMA architecture, and is reported to match or exceed GPT-4 and WMT competition winners on standard translation benchmarks. Its 4096-token context length accommodates typical translation inputs.
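A minimal sketch of how the model might be queried through the Hugging Face `transformers` library. The prompt template follows the convention used by the ALMA models; the generation settings and the `translate` helper are illustrative assumptions, not an official API:

```python
def build_prompt(text: str, src_lang: str, tgt_lang: str) -> str:
    # ALMA-style translation prompt, e.g.
    # "Translate this from English to German:\nEnglish: ...\nGerman:"
    return f"Translate this from {src_lang} to {tgt_lang}:\n{src_lang}: {text}\n{tgt_lang}:"

def translate(text: str, src_lang: str = "English", tgt_lang: str = "German") -> str:
    # Hypothetical helper; requires the transformers library, torch,
    # and enough GPU memory to hold the 13B model.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("haoranxu/ALMA-13B-R")
    model = AutoModelForCausalLM.from_pretrained(
        "haoranxu/ALMA-13B-R", torch_dtype=torch.float16, device_map="auto"
    )
    prompt = build_prompt(text, src_lang, tgt_lang)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    # Decode only the newly generated tokens (the translation itself).
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

The prompt format (source-language instruction followed by a target-language cue) is what steers the model toward emitting only the translation.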

Status: Warm
Visibility: Public
Parameters: 13B
Precision: FP8
Context length: 4096
License: MIT
Source: Hugging Face