KaraKaraWitch/Matsutei-Qwen2.5-72b
KaraKaraWitch/Matsutei-Qwen2.5-72b is a 72.7-billion-parameter language model built on the Qwen2.5 architecture, created by KaraKaraWitch as a TIES merge of several pre-trained models. The merge targets problems with world-book lore injection, reducing the confusion that can arise when external lore context is inserted into a prompt. With a 131,072-token context length, the model is suited to long narrative and contextual inputs; its primary strength is keeping complex lore coherent and avoiding the confusion that often appears early in generation.
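
Since the model follows the Qwen2.5 architecture, it should load like any other Qwen2.5-based checkpoint via Hugging Face transformers. The snippet below is a minimal sketch, assuming the weights are published on the Hub under the KaraKaraWitch/Matsutei-Qwen2.5-72b repo id, that the merged checkpoint ships a chat template (as Qwen2.5 instruct models do), and that enough GPU memory is available to shard a 72B model with device_map="auto". The lore string and prompt are hypothetical placeholders illustrating the lore-as-context usage described above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repo id; adjust if the weights are hosted elsewhere.
model_id = "KaraKaraWitch/Matsutei-Qwen2.5-72b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # shard across available GPUs (requires accelerate)
)

# Hypothetical example: inject world-book lore as context ahead of the user turn.
lore = "World book: The kingdom of Matsutei lies beyond the northern mountains."
prompt = "Describe the capital city of Matsutei."

messages = [
    {"role": "system", "content": lore},
    {"role": "user", "content": prompt},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Placing the lore in the system turn, as sketched here, is one common way to exercise the lore-injection behavior the model card describes; the 131,072-token window leaves ample room for much larger world-book entries.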