jpacifico/Chocolatine-3B-Instruct-DPO-v1.2

Chocolatine-3B-Instruct-DPO-v1.2 is a 3.82-billion-parameter instruction-tuned causal language model by jpacifico, built on Microsoft's Phi-3.5-mini-instruct. It was fine-tuned with Direct Preference Optimization (DPO) on a French RLHF dataset and supports a 128K-token context window. The model performs strongly on French-language tasks and even outperforms its base model in English, making it well suited to applications that need robust French understanding and generation.
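A minimal sketch of running the model with the Hugging Face `transformers` library. The `build_phi3_prompt` helper is a hypothetical illustration of the Phi-3 chat format used by the base model (Phi-3.5-mini-instruct); in practice, `tokenizer.apply_chat_template` handles this formatting for you, and loading the weights requires `pip install transformers torch` plus roughly 8 GB of memory in BF16.

```python
def build_phi3_prompt(messages):
    """Render a list of {"role", "content"} dicts into the Phi-3 chat
    format (assumed here to match the base model's template)."""
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}<|end|>\n")
    parts.append("<|assistant|>\n")  # cue the model to respond
    return "".join(parts)


def generate(prompt, max_new_tokens=256):
    # Heavyweight step: downloads ~7.6 GB of weights on first call.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    import torch

    model_id = "jpacifico/Chocolatine-3B-Instruct-DPO-v1.2"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)


prompt = build_phi3_prompt(
    [{"role": "user", "content": "Explique la photosynthèse en une phrase."}]
)
```

The French prompt above plays to the model's strength; English prompts work as well given its reported English performance.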

Parameters: 4B
Precision: BF16
License: mit
Available on Hugging Face