unsloth/Qwen3-0.6B

The Qwen3-0.6B model, developed by Qwen, is a 0.6 billion parameter causal language model from the Qwen3 series, featuring a unique capability to seamlessly switch between a 'thinking mode' for complex reasoning and a 'non-thinking mode' for general dialogue. This model is designed for enhanced reasoning, instruction-following, and agent capabilities, supporting over 100 languages with a context length of 32,768 tokens. It excels in tasks requiring logical reasoning, mathematics, code generation, and multilingual instruction following.

Cold
Public
0.8B
BF16
40960
Hugging Face

No reviews yet. Be the first to review!