TIGER-Lab/MAmmoTH2-8B

MAmmoTH2-8B is an 8 billion parameter instruction-tuned causal language model developed by TIGER-Lab, based on the Llama-3 architecture with an 8192 token context length. It is specifically optimized for enhancing reasoning abilities, particularly in mathematical tasks, by leveraging 10 million instruction-response pairs harvested from web corpora. This model demonstrates significant performance improvements on benchmarks like MATH and GSM8K without relying on domain-specific training data.

Warm
Public
8B
FP8
8192
License: mit
Hugging Face

No reviews yet. Be the first to review!