numind/NuExtract-1.5

NuExtract-1.5 by NuMind is a 4-billion-parameter language model, fine-tuned from Phi-3.5-mini-instruct for structured information extraction. It is designed to extract data from long documents (up to 20k tokens) and supports English, French, Spanish, German, Portuguese, and Italian. The model prioritizes pure extraction: the text it generates is meant to appear verbatim in the original source, which makes it well suited to precise data retrieval tasks.
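The pure-extraction property can be checked after the fact. The sketch below is illustrative only and does not call the model: `build_prompt`, `string_leaves`, and `non_extractive_values` are hypothetical helper names, the sample text and template are made up, and the prompt layout is an assumption that should be confirmed against the upstream model card before use.

```python
import json


def build_prompt(template: dict, text: str) -> str:
    # Assumed prompt layout (not confirmed by this page): a JSON template
    # followed by the source text, wrapped in input/output markers.
    return (
        "<|input|>\n### Template:\n"
        + json.dumps(template, indent=4)
        + "\n### Text:\n"
        + text
        + "\n\n<|output|>"
    )


def string_leaves(value):
    """Yield every non-empty string found in a nested JSON-like value."""
    if isinstance(value, str):
        if value:
            yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from string_leaves(item)
    elif isinstance(value, list):
        for item in value:
            yield from string_leaves(item)


def non_extractive_values(source: str, extracted) -> list:
    """Return extracted strings that do not occur verbatim in the source."""
    return [s for s in string_leaves(extracted) if s not in source]


# Hypothetical example data.
source = "Marie Curie was born in Warsaw in 1867."
template = {"name": "", "birthplace": "", "birth_year": ""}

# A result that copies spans from the source passes the check.
good = {"name": "Marie Curie", "birthplace": "Warsaw", "birth_year": "1867"}
# A result that adds outside knowledge ("Poland") is flagged.
bad = {"name": "Marie Curie", "birthplace": "Warsaw, Poland", "birth_year": "1867"}

print(non_extractive_values(source, good))
print(non_extractive_values(source, bad))
```

A check like this is a simple substring test, so it will also flag harmless normalization such as changed whitespace or casing; treat a flagged value as something to review, not as proof of an error.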

Status: Warm
Visibility: Public
Parameters: 4B
Precision: BF16
Context length: 4096
License: MIT
Hugging Face