MetaStoneTec/XBai-o4
MetaStoneTec/XBai-o4 is a 32-billion-parameter large language model developed by MetaStone-AI and presented as their fourth-generation open-source technology. It is trained with a reflective generative form that combines Long-CoT (long chain-of-thought) Reinforcement Learning with Process Reward Learning, so the model can both reason deeply and select high-quality reasoning trajectories. This design significantly reduces inference costs while performing strongly on complex reasoning tasks, and it is reported to surpass OpenAI-o3-mini in Medium mode.
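To make "trajectory selection" concrete, here is a minimal sketch of best-of-N selection over candidate reasoning trajectories, each with per-step scores such as a process reward model might produce. It is an illustration only, not XBai-o4's actual implementation: the function names `trajectory_score` and `select_best` are hypothetical, the scores are made-up values in (0, 1], and aggregating step scores with a geometric mean is an assumption chosen for the example.

```python
from math import exp, log
from typing import List, Sequence


def trajectory_score(step_scores: Sequence[float]) -> float:
    """Aggregate per-step scores into one trajectory score.

    Assumption: the geometric mean is used here purely for illustration;
    the source does not say how XBai-o4 aggregates step-level rewards.
    """
    if not step_scores:
        raise ValueError("a trajectory needs at least one scored step")
    eps = 1e-12  # guard against log(0)
    return exp(sum(log(max(s, eps)) for s in step_scores) / len(step_scores))


def select_best(candidates: List[str], scores: List[Sequence[float]]) -> str:
    """Return the candidate whose trajectory score is highest."""
    if len(candidates) != len(scores):
        raise ValueError("each candidate needs its own list of step scores")
    best = max(range(len(candidates)), key=lambda i: trajectory_score(scores[i]))
    return candidates[best]


# Hypothetical candidates and step scores (not real model output).
candidates = ["answer A", "answer B", "answer C"]
step_scores = [
    [0.9, 0.2, 0.9],  # one weak step drags the trajectory down
    [0.7, 0.8, 0.7],  # consistently solid steps
    [0.5, 0.5, 0.5],
]
chosen = select_best(candidates, step_scores)
```

With these made-up scores the second candidate is chosen, because a geometric mean penalizes a single weak step more than an arithmetic mean would. Sampling more candidates generally trades extra inference cost for a better chance that one of them scores well.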