allenai/tulu-2-dpo-13b

allenai/tulu-2-dpo-13b is a 13 billion parameter language model developed by AllenAI, fine-tuned from Llama 2 using Direct Preference Optimization (DPO). It is designed as a helpful assistant, excelling in chat-based interactions and offering a strong alternative to Llama 2 13B Chat. This model demonstrates enhanced alignment and performance on benchmarks like MT-Bench and AlpacaEval.