tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1

The tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1 is an 8 billion parameter instruction-tuned causal language model developed by tokyotech-llm, built upon the Meta Llama 3 family. It features continual pre-training with a primary focus on Japanese language data, enhancing its performance in Japanese tasks. This model is optimized for instruction-following in both Japanese and English, making it suitable for bilingual applications requiring nuanced understanding and generation.

Warm
Public
8B
FP8
8192
License: llama3
Hugging Face

No reviews yet. Be the first to review!