TIGER-Lab/MAmmoTH2-8B

MAmmoTH2-8B is an 8 billion parameter instruction-tuned causal language model developed by TIGER-Lab, based on the Llama-3 architecture with an 8192 token context length. It is specifically optimized for enhancing reasoning abilities, particularly in mathematical tasks, by leveraging 10 million instruction-response pairs harvested from web corpora. This model demonstrates significant performance improvements on benchmarks like MATH and GSM8K without relying on domain-specific training data.

Warm

Public

Model Size: 8B

Quant: FP8

Ctx length: 8192

License: mit

Hugging Face

No reviews yet. Be the first to review!