tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.3

Llama-3.1-Swallow-70B-Instruct-v0.3 is a 70-billion-parameter instruction-tuned large language model developed by tokyotech-llm and built on Meta Llama 3.1. Continual pre-training on approximately 200 billion Japanese and English tokens strengthens its Japanese language capabilities while retaining strong English performance. The model is tuned for multi-turn dialogue, generates helpful and detailed responses, and is particularly strong on Japanese conversational tasks.

Warm

Public

Model Size: 70B

Quant: FP8

Ctx length: 32768

License: llama3.1

Hugging Face
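
As a rough illustration of what the listed size and quantization imply, the sketch below estimates weight storage for a nominal 70B-parameter model at FP8 (one byte per parameter) and, for comparison, at 16 bits per parameter. It assumes exactly 70e9 parameters and counts weights only; KV cache, activations, and runtime overhead are ignored. The helper name `weight_memory_gb` is hypothetical and not part of any library.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: the nominal "70B" is taken as exactly 70e9 parameters.
NUM_PARAMS = 70e9

fp8_gb = weight_memory_gb(NUM_PARAMS, 8)     # FP8: 1 byte per parameter
bf16_gb = weight_memory_gb(NUM_PARAMS, 16)   # 16-bit: 2 bytes per parameter

print(f"FP8 weights:    ~{fp8_gb:.0f} GB")
print(f"16-bit weights: ~{bf16_gb:.0f} GB")
```

Under these assumptions, FP8 halves the weight footprint relative to a 16-bit checkpoint. Actual memory use when serving the model will be higher, especially near the full 32768-token context.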
