BytedTsinghua-SIA/DAPO-Qwen-32B

DAPO-Qwen-32B is a 32.8B-parameter language model from BytedTsinghua-SIA, built on the Qwen2.5-32B base model and trained with the DAPO (Decoupled Clip and Dynamic sAmpling Policy Optimization) reinforcement-learning algorithm, with a focus on mathematical problem solving. With a context length of 131,072 tokens, the model is suited to complex reasoning tasks that require long, step-by-step mathematical solutions.

Status: Warm
Visibility: Public
Parameters: 32.8B
Quantization: FP8
Context length: 131,072 tokens
License: apache-2.0
Weights: Hugging Face
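
Since the checkpoint is published on Hugging Face and derives from Qwen2.5-32B, it should load with the standard transformers API. The snippet below is a minimal usage sketch under that assumption; the prompt and generation settings are illustrative, not documented defaults.

```python
# Minimal sketch: load DAPO-Qwen-32B with the standard transformers API
# and ask it a step-by-step math question. Prompt and generation settings
# are illustrative assumptions, not values documented for this model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BytedTsinghua-SIA/DAPO-Qwen-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # shard across available GPUs (requires accelerate)
)

# Assumes the tokenizer ships a chat template, as Qwen2.5-family checkpoints do.
messages = [
    {"role": "user",
     "content": "Solve step by step: what is the sum of the first 100 positive integers?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

A 32.8B-parameter model at this context length will not fit on a single consumer GPU; `device_map="auto"` spreads the weights across whatever devices are available, or a hosted endpoint can be used instead.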