tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3
Llama-3.1-Swallow-8B-Instruct-v0.3 by tokyotech-llm is an 8-billion-parameter instruction-tuned large language model, continually pre-trained from Meta's Llama 3.1. Continual pre-training on approximately 200 billion tokens drawn from Japanese web corpora, Wikipedia, and technical content substantially strengthens its Japanese language capabilities while retaining strong English performance. The model is particularly well suited to multi-turn Japanese dialogue, achieving state-of-the-art results on Japanese MT-Bench among open-source LLMs of comparable size.
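Since the model follows the standard Hugging Face chat-model interface, a minimal usage sketch might look like the following. The Japanese prompt and generation settings are illustrative, not an official recommendation.

```python
# A minimal sketch, assuming the model is loaded via the Hugging Face
# transformers library; prompt and generation parameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision so the 8B model fits on one GPU
    device_map="auto",
)

# A single-turn Japanese chat message, formatted with the model's
# built-in chat template ("Please tell me the prefectures in Shikoku, Japan.").
messages = [
    {"role": "user", "content": "日本の四国地方にある県を教えてください。"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For multi-turn dialogue, append each assistant reply and the next user message to `messages` and re-apply the chat template before generating again.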