aws-prototyping/MegaBeam-Mistral-7B-300k

MegaBeam-Mistral-7B-300k is a 7-billion-parameter language model from aws-prototyping, fine-tuned from Mistral-7B-Instruct-v0.2. It is engineered for exceptionally long inputs, extending the base model's context window to as many as 320,000 tokens. The model performs strongly on long-context understanding and retrieval tasks, making it well suited to applications that process lengthy documents or conversations.

Status: Warm
Visibility: Public
Parameters: 7B
Quantization: FP8
Max tokens: 8192
License: apache-2.0
Source: Hugging Face
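
Below is a minimal usage sketch with the Hugging Face transformers library. The repository ID comes from this page; the dtype, device mapping, prompt, and generation settings are illustrative assumptions rather than settings prescribed by the model's authors, and inference anywhere near the 320,000-token limit requires far more GPU memory than this short example implies.

```python
# Minimal sketch: querying MegaBeam-Mistral-7B-300k via transformers.
# The dtype/device settings and prompt below are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aws-prototyping/MegaBeam-Mistral-7B-300k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed: bf16 so a 7B model fits on one large GPU
    device_map="auto",
)

# The tokenizer ships a Mistral-instruct chat template.
messages = [{"role": "user", "content": "Summarize the document below.\n\n<document text>"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Filling a meaningful fraction of the 320k-token window is memory-intensive; dedicated serving stacks such as vLLM are commonly used for contexts of that length.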