cyberagent/DeepSeek-R1-Distill-Qwen-14B-Japanese

The cyberagent/DeepSeek-R1-Distill-Qwen-14B-Japanese is a 14 billion parameter language model, fine-tuned for Japanese language tasks. It is based on the DeepSeek-R1-Distill-Qwen-14B architecture, leveraging its reasoning capabilities. This model is specifically optimized for generating Japanese text and understanding Japanese queries, making it suitable for applications requiring high-quality Japanese language processing.

Warm
Public
14B
FP8
32768
License: mit
Hugging Face