shitshow123/tinylamma-20000
shitshow123/tinylamma-20000 is a 1.1 billion parameter instruction-tuned causal language model developed by shitshow123. It was trained for 20,000 steps with Direct Preference Optimization (DPO) on a TinyLlama 1.1B base to refine its instruction-following behavior. The model supports a context length of 2048 tokens and targets efficient inference for instruction-based applications.
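Below is a minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub under this id and loads with the standard transformers auto classes; the prompt text and generation settings are illustrative assumptions, not documented behavior of this model.

```python
# Minimal usage sketch (assumption: the model id resolves on the Hugging Face
# Hub and the checkpoint loads via the standard transformers auto classes).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shitshow123/tinylamma-20000"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    # 1.1B parameters fit comfortably in fp16 on a single consumer GPU.
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

prompt = "Explain what Direct Preference Optimization does in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Keep prompt length plus max_new_tokens within the 2048-token context window.
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If the model ships with a chat template, `tokenizer.apply_chat_template` would be the more appropriate way to format prompts; whether one is defined for this checkpoint is not stated here.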