Released: 14 November 2025
We’re excited to announce the launch of our first dedicated OCR model, delivering significantly improved accuracy, speed, and layout understanding for image-based text extraction. This release expands our multimodal AI capabilities and strengthens support for enterprise-scale document processing.
What’s New?
1. Llama-4-Maverick-17B-128E-Instruct (OCR)
A high-precision Optical Character Recognition model designed for reliability across diverse real-world input conditions.
Key features:
- High-accuracy text extraction from documents, photos, screenshots, and natural scenes
- Robust layout understanding, including tables, multi-column layouts, and irregular formatting
- Mixed-language recognition with improved multilingual performance
- Noise-tolerant inference for low-resolution, blurred, or compressed images
- Optimized throughput and latency for large-scale production workloads
- Easy API integration with existing applications and pipelines
Performance Enhancements
This model introduces:
- Faster inference time for high-volume OCR jobs
- Improved reconstruction of structured and semi-structured documents
- Higher accuracy across multilingual and complex-layout documents
- Enhanced resilience to visual noise and distortion
Next Steps for You
- Start using the OCR model through the standard API endpoint.
- Evaluate improvements by testing your document and image samples.
- Integrate the model into automated pipelines for data extraction.
- Reach out to support if you need help with onboarding or migration.
We’re excited to introduce Llama-4-Maverick-17B-128E-Instruct (OCR) and look forward to supporting your adoption.
Last updated on