jq/Qwen-14B-pretrain-including-parallel-text-extended

jq/Qwen-14B-pretrain-including-parallel-text-extended is a 14-billion-parameter language model with a 32,768-token context length. It is a pre-trained variant of the Qwen architecture, developed by Qwen. The model card does not describe its specific differentiators or primary use cases, so further details on its development and capabilities are needed.

Visibility: Public
Parameters: 14B
Precision: FP8
Context length: 32,768 tokens
Hosted on: Hugging Face
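
Since the model card gives no usage instructions, the following is a minimal sketch of how one might load the checkpoint, assuming it is a standard Qwen-style causal language model compatible with the Hugging Face `transformers` library. The repo id comes from the listing above; the dtype and device settings are assumptions, and FP8 weights may require additional library support.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jq/Qwen-14B-pretrain-including-parallel-text-extended"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # let transformers pick the checkpoint's dtype
    device_map="auto",    # spread the 14B parameters across available devices
)

# As a pre-trained (non-instruct) base model, it would be used for
# plain text continuation rather than chat-style prompting.
inputs = tokenizer("The quick brown fox", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Whether this loading path works as written depends on the repository's actual configuration files, which the model card does not document.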