Medium is the new large.

Mistral Medium 3 delivers state-of-the-art performance at 8X lower cost with radically simplified enterprise deployments.

Research
May 7, 2025 · Mistral AI

At Mistral AI, we are continuously pushing the frontier with both open models (Mistral Small, Mistral Large, Pixtral, and many others) and enterprise models (Mistral OCR, Mistral Saba, Ministral 3B / 8B, and more). Ever since Mistral 7B, our models have consistently matched the performance of significantly larger and more expensive models. Today, we are excited to announce Mistral Medium 3, pushing the efficiency and usability of language models even further.

Highlights.

  1. Mistral Medium 3 introduces a new class of models that balances:

    • SOTA performance

    • 8X lower cost

    • simpler deployability to accelerate enterprise usage

  2. The model leads in professional use cases such as coding and multimodal understanding

  3. The model delivers a range of enterprise capabilities including:

    • Hybrid or on-premises / in-VPC deployment

    • Custom post-training 

    • Integration into enterprise tools and systems

The perfect balance.

Mistral Medium 3 delivers frontier performance while being an order of magnitude less expensive. For instance, the model performs at or above 90% of Claude Sonnet 3.7 on benchmarks across the board at a significantly lower cost ($0.4 input / $2 output per M token). 
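At these rates, per-workload cost is easy to estimate. A quick sketch, where only the per-million-token rates come from the announcement and the token counts are purely illustrative:

```python
# Published Mistral Medium 3 API rates (USD per million tokens).
INPUT_RATE = 0.4
OUTPUT_RATE = 2.0

def api_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one workload at the published rates."""
    return (input_tokens / 1_000_000) * INPUT_RATE \
         + (output_tokens / 1_000_000) * OUTPUT_RATE

# Illustrative workload: 5M input tokens, 1M output tokens.
print(f"${api_cost(5_000_000, 1_000_000):.2f}")  # → $4.00
```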

On performance, Mistral Medium 3 also surpasses leading open models such as Llama 4 Maverick and enterprise models such as Cohere Command A. On pricing, the model beats cost leaders such as DeepSeek v3, both in API and self-deployed systems. 

Additionally, Mistral Medium 3 can be deployed on any cloud, including self-hosted environments with four GPUs or more. 

Top-tier performance.

Mistral Medium 3 is designed to be frontier-class, particularly for professional use cases. In the evaluations below, we use numbers previously reported by other providers wherever available; otherwise, we use our own evaluation harness. Mistral Medium 3 stands out in particular on coding and STEM tasks, where it comes close to its very large and much slower competitors. 

[Table: Mistral Medium 3 benchmark results]

*Performance accuracy on all benchmarks was obtained through the same internal evaluation pipeline.

Human Evals.

In addition to academic benchmarks, we report third-party human evaluations that are more representative of real-world use cases. Mistral Medium 3 continues to shine in the coding domain, delivering much better performance across the board than some of its much larger competitors.

[Chart: side-by-side Surge human evals, coding]

[Chart: side-by-side Surge human evals]

Built for enterprise use cases.

Mistral Medium 3 stands out from other SOTA models in its ability to adapt to enterprise contexts. In a world where organizations are forced to choose between fine-tuning over an API and self-deploying and customizing model behaviour from scratch, Mistral Medium 3 offers a path to comprehensively integrating intelligence into enterprise systems. With the help of Mistral’s applied AI solutions, the model can be continuously pretrained, fully fine-tuned, and blended into enterprise knowledge bases, making it a high-fidelity solution for domain-specific training, continuous learning, and adaptive workflows. Beta customers across financial services, energy, and healthcare are using the model to enrich customer service with deep context, personalize business processes, and analyze complex datasets. 

Available today. 

The Mistral Medium 3 API is available starting today on Mistral La Plateforme and Amazon SageMaker, and soon on IBM WatsonX, NVIDIA NIM, Azure AI Foundry, and Google Cloud Vertex. To deploy and customize the model in your environment, please contact us.
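For a sense of what calling the model over La Plateforme might look like, here is a minimal sketch against Mistral's standard chat-completions endpoint. The model identifier and prompt below are assumptions for illustration; check the API documentation for the exact name under which Mistral Medium 3 is served.

```python
import json
import os
import urllib.request

# Mistral's chat-completions endpoint; the model name is an assumption.
API_URL = "https://api.mistral.ai/v1/chat/completions"
payload = {
    "model": "mistral-medium-latest",  # hypothetical identifier
    "messages": [{"role": "user", "content": "Summarize this release note."}],
}

def build_request(api_key: str) -> urllib.request.Request:
    """Build the HTTP request; sending it requires a valid API key."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Only send when a key is configured in the environment.
if os.environ.get("MISTRAL_API_KEY"):
    with urllib.request.urlopen(build_request(os.environ["MISTRAL_API_KEY"])) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```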

One more thing…

With the launches of Mistral Small in March and Mistral Medium today, it’s no secret that we’re working on something ‘large’ over the next few weeks. With even our medium-sized model being resoundingly better than flagship open source models such as Llama 4 Maverick, we’re excited to ‘open’ up what’s to come :)