Nvidia and Mistral AI have formed a partnership to introduce the Mistral 3 range of open-source multilingual and multimodal models, optimised for Nvidia's supercomputing and edge platforms.
The new Mistral Large 3 model is based on a mixture-of-experts (MoE) architecture, which routes each input to only the most relevant subset of the model's expert parameters rather than activating the full network.
It features 41 billion active parameters, a total of 675 billion parameters, and a context window of 256,000 tokens.
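To illustrate the idea behind the 41-billion-active versus 675-billion-total split, here is a minimal, hypothetical sketch of MoE routing in Python. The sizes, router, and expert layers are toy stand-ins for illustration only, not Mistral's implementation.

```python
import numpy as np

# Illustrative mixture-of-experts routing: a router scores the experts
# for each token and only the top-k experts run, so the "active"
# parameter count per token stays far below the total parameter count.
rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2  # toy sizes, not Mistral Large 3's

router_w = rng.standard_normal((d_model, n_experts))
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route a single token vector through its top-k experts."""
    logits = x @ router_w
    chosen = np.argsort(logits)[-top_k:]  # indices of the top-k experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()              # softmax over the chosen experts
    # Only the selected experts' weights are used for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)  # (64,)
```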
According to Nvidia, integrating its GB200 NVL72 systems with Mistral AI’s MoE architecture enables enterprises to efficiently deploy and scale large AI models while leveraging advanced parallelism and hardware-level optimisations.
The model will be available from Tuesday, 2 December 2025.
Performance testing showed that Mistral Large 3 delivered a tenfold performance increase on the GB200 NVL72 system compared with the previous-generation Nvidia H200.
The improvements have been attributed to parallelism optimisations, support for low-precision formats such as NVFP4, and the disaggregated inference techniques provided by Nvidia Dynamo.
Mistral AI has also released nine smaller language models in the Ministral 3 suite.
These models are designed to run on Nvidia's edge devices, including DGX Spark, RTX PCs and laptops, and Jetson devices. Developers can access them through AI frameworks such as Llama.cpp and Ollama.
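As a rough sketch of the Ollama route, the official `ollama` Python client can query a locally pulled model. The model tag "ministral-3" below is an assumption; check Ollama's library for the actual published name.

```python
import ollama

# Assumed model tag; substitute the real Ministral 3 tag once published.
response = ollama.chat(
    model="ministral-3",
    messages=[{"role": "user", "content": "Summarise MoE routing in one line."}],
)
print(response["message"]["content"])
```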
The Mistral 3 family is openly available, allowing researchers and developers to experiment with and adapt the models as needed.
By using Nvidia’s NeMo open-source tools, including Data Designer, Customizer, Guardrails, and the NeMo Agent Toolkit, enterprises can further tailor these models to their requirements.
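For instance, NeMo Guardrails wraps a model behind configurable safety rails. The sketch below assumes a local `./config` directory containing a `config.yml` that points at a Mistral 3 endpoint; that directory and its contents are hypothetical.

```python
from nemoguardrails import RailsConfig, LLMRails

# Assumed config directory defining the rails and the backing model.
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

response = rails.generate(
    messages=[{"role": "user", "content": "Hello"}]
)
print(response["content"])
```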
Nvidia has also optimised inference frameworks such as TensorRT-LLM, vLLM, and SGLang for the Mistral 3 models, improving performance from cloud to edge.
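Taking vLLM as an example, serving a checkpoint is a few lines of Python. The repository ID "mistralai/Mistral-Large-3" is an assumption; substitute the real Hugging Face repo name once published.

```python
from vllm import LLM, SamplingParams

# Assumed repo id; replace with the published Mistral 3 checkpoint.
llm = LLM(model="mistralai/Mistral-Large-3")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Explain disaggregated inference briefly."], params)
print(outputs[0].outputs[0].text)
```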
The models are available now via major open-source platforms and cloud providers, with further deployment as Nvidia NIM microservices planned.
Recently, Nvidia invested $2bn in Synopsys as part of an expanded strategic partnership aimed at leveraging AI and accelerated computing for engineering platforms used across various industries.
