haoranxu/ALMA-13B-R
ALMA-13B-R is a 13-billion-parameter language model from Haoran Xu and collaborators, fine-tuned specifically for machine translation. It applies Contrastive Preference Optimization (CPO) on top of the ALMA model family, and according to the authors' evaluations its translation quality matches or exceeds that of GPT-4 and WMT competition winners. A 4096-token context window accommodates long translation inputs.
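The snippet below is a minimal sketch of running the model for translation with the Hugging Face transformers library. The prompt template mirrors the "Translate this from X to Y:" format used in the ALMA project's examples; the language pair, sample sentence, and generation settings here are illustrative assumptions, not fixed requirements.

```python
# Minimal sketch: load ALMA-13B-R and translate one sentence.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "haoranxu/ALMA-13B-R", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("haoranxu/ALMA-13B-R")

# ALMA-style translation prompt (German -> English shown as an example).
prompt = (
    "Translate this from German to English:\n"
    "German: Das Wetter ist heute schön.\n"
    "English:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        num_beams=5,        # beam search; a common choice for MT decoding
        max_new_tokens=128,
    )

# Strip the prompt tokens and decode only the generated translation.
translation = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(translation)
```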