Release Notes: 2025-11-14

Released: 14 November 2025

We’re excited to announce the launch of our first dedicated OCR model, delivering significantly improved accuracy, speed, and layout understanding for image-based text extraction. This release expands our multimodal AI capabilities and strengthens support for enterprise-scale document processing.

What’s New?

1. Llama-4-Maverick-17B-128E-Instruct (OCR)

A high-precision Optical Character Recognition model designed for reliability across diverse real-world input conditions.

Key features:

  • High-accuracy text extraction from documents, photos, screenshots, and natural scenes
  • Robust layout understanding, including tables, multi-column layouts, and irregular formatting
  • Mixed-language recognition with improved multilingual performance
  • Noise-tolerant inference for low-resolution, blurred, or compressed images
  • Optimized throughput and latency for large-scale production workloads
  • Easy API integration with existing applications and pipelines (see the request sketch just below)
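
The snippet below is a minimal sketch of one way to send an image to the model over an OpenAI-compatible chat completions API. The endpoint URL, the OCR_API_KEY environment variable, the model identifier string, and the message/response schema are illustrative assumptions, not confirmed details of the API; substitute the values from your account documentation.

  # Hypothetical sketch: calling the OCR model over an OpenAI-style HTTP API.
  # The endpoint URL, auth header, model identifier, and message format below
  # are assumptions; substitute the values documented for your deployment.
  import base64
  import os

  import requests

  API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
  API_KEY = os.environ["OCR_API_KEY"]                       # placeholder env var

  def extract_text(image_path: str) -> str:
      """Send one image to the OCR model and return the extracted text."""
      with open(image_path, "rb") as f:
          image_b64 = base64.b64encode(f.read()).decode("ascii")

      payload = {
          # Model identifier assumed; check the model list for your deployment.
          "model": "Llama-4-Maverick-17B-128E-Instruct",
          "messages": [
              {
                  "role": "user",
                  "content": [
                      {"type": "text",
                       "text": "Extract all text from this image, preserving layout."},
                      {"type": "image_url",
                       "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
                  ],
              }
          ],
      }
      resp = requests.post(
          API_URL,
          headers={"Authorization": f"Bearer {API_KEY}"},
          json=payload,
          timeout=60,
      )
      resp.raise_for_status()
      # Response shape assumed to follow the OpenAI chat completions format.
      return resp.json()["choices"][0]["message"]["content"]

  print(extract_text("invoice.png"))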

Performance Enhancements

This model introduces:

  • Faster inference time for high-volume OCR jobs
  • Improved reconstruction of structured and semi-structured documents
  • Higher accuracy across multilingual and complex-layout documents
  • Enhanced resilience to visual noise and distortion

Next Steps for You

  1. Start using the OCR model through the standard API endpoint.
  2. Evaluate improvements by testing your document and image samples.
  3. Integrate the model into automated pipelines for data extraction (a batching sketch follows this list).
  4. Reach out to support if you need help with onboarding or migration.
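
As a starting point for step 3, the sketch below batches a folder of scanned images through the hypothetical extract_text helper from the earlier example and writes one text file per page. The folder names, file pattern, and error handling are illustrative assumptions rather than a prescribed pipeline.

  # Illustrative pipeline sketch. Assumes the extract_text helper defined in
  # the earlier example, a local folder of PNG scans, and plain-text output.
  from pathlib import Path

  def ocr_folder(input_dir: str, output_dir: str) -> None:
      """Run OCR over every image in input_dir and save one .txt file per image."""
      out = Path(output_dir)
      out.mkdir(parents=True, exist_ok=True)
      for image_path in sorted(Path(input_dir).glob("*.png")):
          try:
              text = extract_text(str(image_path))
          except Exception as exc:  # keep the batch running if one page fails
              print(f"Skipping {image_path.name}: {exc}")
              continue
          (out / f"{image_path.stem}.txt").write_text(text, encoding="utf-8")
          print(f"Processed {image_path.name}")

  ocr_folder("scans/", "extracted_text/")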

We’re excited to introduce Llama-4-Maverick-17B-128E-Instruct (OCR) and look forward to supporting your adoption.
