mlx-community/Llama-3.1-Nemotron-70B-Instruct-HF-bf16
mlx-community/Llama-3.1-Nemotron-70B-Instruct-HF-bf16 is a 70-billion-parameter instruction-tuned language model converted to MLX format from NVIDIA's Llama-3.1-Nemotron-70B-Instruct-HF. The model supports a 32768-token context length and is intended for general-purpose conversational AI and instruction-following tasks. Its large parameter count and instruction tuning make it well suited to complex natural language understanding and generation.
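
Below is a minimal usage sketch assuming the `mlx-lm` package is installed (`pip install mlx-lm`); the prompt text and generation settings are illustrative placeholders, not part of the model card.

```python
from mlx_lm import load, generate

# Load the MLX-converted weights and tokenizer from the Hugging Face Hub.
model, tokenizer = load("mlx-community/Llama-3.1-Nemotron-70B-Instruct-HF-bf16")

# Example chat-style prompt; the message content here is just a placeholder.
messages = [{"role": "user", "content": "Explain the difference between a list and a tuple in Python."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Generate a response; verbose=True prints tokens as they are produced.
response = generate(model, tokenizer, prompt=prompt, verbose=True)
print(response)
```

Note that the bf16 weights of a 70B model require a machine with substantial unified memory; smaller quantized variants are typically a better fit for constrained hardware.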