cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese

cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese is a 32-billion-parameter causal language model developed by CyberAgent, created by fine-tuning deepseek-ai/DeepSeek-R1-Distill-Qwen-32B for Japanese. It is optimized for Japanese-language understanding and generation and supports a 32,768-token context window, making it suitable for complex, long-context tasks and applications that require robust Japanese NLP capabilities.
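Below is a minimal usage sketch that loads the model with the Hugging Face transformers library. The example prompt, dtype, and generation settings are illustrative assumptions, not recommended defaults from the model authors.

```python
# Minimal sketch: load and query the model via transformers.
# dtype and generation settings below are assumptions; adjust to your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; a 32B model needs multiple GPUs or offloading
    device_map="auto",
)

# Build a chat-style prompt using the tokenizer's chat template.
# Example question: "What is the tallest mountain in Japan?"
messages = [{"role": "user", "content": "日本で一番高い山は何ですか？"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```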

Status: Warm
Visibility: Public
Parameters: 32B
Quantization: FP8
Context length: 32,768 tokens
License: MIT
Source: Hugging Face