tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1

The tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1 is an 8 billion parameter instruction-tuned causal language model developed by tokyotech-llm, built upon the Meta Llama 3 family. It features continual pre-training with a primary focus on Japanese language data, enhancing its performance in Japanese tasks. This model is optimized for instruction-following in both Japanese and English, making it suitable for bilingual applications requiring nuanced understanding and generation.

Warm

Public

Model Size: 8B

Quant: FP8

Ctx length: 8192

License: llama3

Hugging Face