cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese

cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese is a 32-billion-parameter causal language model developed by CyberAgent, created by fine-tuning deepseek-ai/DeepSeek-R1-Distill-Qwen-32B for Japanese. It is optimized for Japanese-language understanding and generation and supports a 32,768-token context window, making it suitable for complex, long-context tasks and applications that require robust Japanese NLP capabilities.
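Below is a minimal usage sketch that loads the model with the Hugging Face transformers library. The example prompt, dtype, and generation settings are illustrative assumptions, not recommended defaults from the model authors.

```python
# Minimal sketch: load and query the model via transformers.
# dtype and generation settings below are assumptions; adjust to your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; a 32B model needs multiple GPUs or offloading
    device_map="auto",
)

# Build a chat-style prompt using the tokenizer's chat template.
# Example question: "What is the tallest mountain in Japan?"
messages = [{"role": "user", "content": "日本で一番高い山は何ですか？"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```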

Status: Warm
Visibility: Public
Parameters: 32B
Quantization: FP8
Context length: 32,768 tokens
License: MIT
Source: Hugging Face