DeepAuto-AI/ldm_soup_Llama-3.1-8B-Inst
DeepAuto-AI/ldm_soup_Llama-3.1-8B-Inst is an 8-billion-parameter language model developed by deepAuto.ai, built on the Llama-3.1-SauerkrautLM-8B-Instruct base. It applies a latent diffusion model trained on the pretrained weights, together with a model-soup weight-averaging technique, to optimize performance on the Winogrande and ARC-Challenge benchmarks. This approach improves performance on unseen leaderboard tasks without additional task-specific training, and the model supports a context length of 32,768 tokens.
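At its core, the model-soup technique is a uniform average of the weights of several fine-tuned checkpoints of the same architecture. The sketch below illustrates only that averaging step on toy weight dictionaries; the function name and list-based weights are illustrative and do not reflect this model's actual training pipeline, which in practice operates on framework tensors.

```python
# Illustrative sketch of model-soup weight averaging (uniform soup):
# parameters from several checkpoints of the same architecture are
# averaged element-wise. Plain Python lists stand in for tensors to
# keep the example self-contained.

def uniform_soup(state_dicts):
    """Return the element-wise mean of matching weight lists."""
    n = len(state_dicts)
    return {
        key: [sum(vals) / n for vals in zip(*(sd[key] for sd in state_dicts))]
        for key in state_dicts[0]
    }

# Toy "checkpoints" with a single weight vector each.
ckpt_a = {"w": [1.0, 3.0]}
ckpt_b = {"w": [3.0, 5.0]}
print(uniform_soup([ckpt_a, ckpt_b])["w"])  # [2.0, 4.0]
```

A uniform soup averages every candidate checkpoint; a common variant (greedy souping) instead adds a checkpoint to the average only if it improves held-out accuracy. Which variant was used here is not stated in this card.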