logicker/SkkuDS-DPO-72B-v1
logicker/SkkuDS-DPO-72B-v1 is a 72.3-billion-parameter language model based on the Qwen1.5 architecture and fine-tuned with Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset. It provides stable support for a 32K-token context length and has multilingual capabilities. The model targets advanced natural language understanding and generation tasks, and its DPO tuning is intended to align its outputs more closely with human preferences.
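
To make the DPO step more concrete, here is a minimal sketch of the standard per-example DPO objective in plain Python. It is illustrative only and is not taken from this model's training code: the function name `dpo_loss`, its argument names, and the value `beta=0.1` are assumptions, since the training hyperparameters are not stated here. Each argument is the summed log-probability of a full response, either the preferred ("chosen") or dispreferred ("rejected") one from a preference pair, under the model being trained (policy) or the frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_ratio - rejected_ratio))).

    beta=0.1 is an assumed, illustrative value, not this model's setting.
    """
    # How much more (in log space) the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)) equals log(1 + exp(-margin)); branch to avoid overflow.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities for one preference pair.
loss_aligned = dpo_loss(-10.0, -12.0, -11.0, -11.0)     # policy favors the chosen response
loss_misaligned = dpo_loss(-12.0, -10.0, -11.0, -11.0)  # policy favors the rejected response
```

The loss shrinks as the policy raises the likelihood of the chosen response relative to the rejected one, measured against the reference model, which is the sense in which DPO tuning pushes a model toward the preferences encoded in a dataset of response pairs.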