KaraKaraWitch/Matsutei-Qwen2.5-72b

KaraKaraWitch/Matsutei-Qwen2.5-72b is a 72.7 billion parameter language model based on the Qwen2.5 architecture, created by KaraKaraWitch via a TIES merge of several pre-trained models. The merge targets a specific problem: world-book (lore) information injection, where externally injected context can confuse the model early in a generation. With a 131,072-token context length, it is well suited to extensive narrative and contextual data, and its primary strength is managing complex lore while reducing that initial confusion during content generation.
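The model card names TIES as the merge method but does not publish the recipe. As a rough illustration only, a TIES merge of Qwen2.5-72B-based models is typically expressed as a mergekit configuration along these lines; the component model names, densities, and weights below are placeholders, not the actual recipe used for Matsutei:

```yaml
# Hypothetical mergekit TIES config (component models are placeholders).
models:
  - model: example-org/qwen2.5-72b-finetune-a   # placeholder, not the real ingredient
    parameters:
      density: 0.5    # fraction of delta weights kept before sign-resolution
      weight: 0.5     # contribution of this model to the merged deltas
  - model: example-org/qwen2.5-72b-finetune-b   # placeholder
    parameters:
      density: 0.5
      weight: 0.5
merge_method: ties
base_model: Qwen/Qwen2.5-72B   # the shared base the task vectors are computed against
parameters:
  normalize: true
dtype: bfloat16
```

TIES works by trimming each fine-tune's parameter deltas to the densest fraction, resolving sign conflicts across models, and averaging the survivors, which is why `density` and `weight` appear per ingredient model.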

Status: Warm
Visibility: Public
Parameters: 72.7B
Quantization: FP8
Context length: 131,072 tokens
Source: Hugging Face
