AI-Sweden-Models/gpt-sw3-356m
GPT-SW3 356M is a 356 million parameter decoder-only transformer language model developed by AI Sweden in collaboration with RISE and WASP WARA for Media and Language. Trained on 320 billion tokens spanning Swedish, Norwegian, Danish, Icelandic, English, and programming code, it generates coherent text in five natural languages and four programming languages. The model is part of a collection focused on advancing large language models for the Nordic languages, and it has a context length of 2048 tokens.
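A minimal generation sketch using the Hugging Face `transformers` library; the model id comes from this card, while the prompt and sampling settings (`max_new_tokens`, `do_sample`, `top_p`) are illustrative assumptions, not recommendations from the authors:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Model id as listed on this card.
model_id = "AI-Sweden-Models/gpt-sw3-356m"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

# An example Swedish prompt (assumed for illustration).
prompt = "Träd är fina för att"
inputs = tokenizer(prompt, return_tensors="pt")

# Keep prompt plus generated tokens within the 2048-token context window.
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,
    top_p=0.9,
)
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(text)
```

Note that downloading the weights requires network access to the Hugging Face Hub, and sampled output will differ between runs.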