tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.1
Llama-3.1-Swallow-70B-Instruct-v0.1 is a 70-billion-parameter instruction-tuned language model developed by tokyotech-llm, built on Meta's Llama 3.1 architecture with a 32,768-token context length. The model gains enhanced Japanese language capabilities through continual pre-training on a large Japanese web corpus while retaining strong English performance. It is designed for instruction-following tasks and is particularly strong at Japanese language understanding and generation.
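Since the model follows the standard Llama 3.1 chat format, it can be loaded with Hugging Face transformers in the usual way. The sketch below is illustrative, not taken from the official model card: the bfloat16 dtype, `device_map="auto"` sharding, and the Japanese example prompt are all assumptions, and a 70B model in bf16 needs on the order of 140 GB of GPU memory (or quantization) to run.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights; adjust to your hardware
    device_map="auto",           # shard the 70B model across available GPUs
)

# Illustrative Japanese instruction; any chat-style prompt works the same way.
messages = [
    {"role": "user", "content": "東京工業大学について簡単に説明してください。"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```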