Nvidia has introduced the Nemotron 3 range of open models, datasets and libraries, designed to support transparent, efficient and specialised agentic AI systems across sectors.
The Nemotron 3 models come in three sizes: Nano, Super and Ultra. They use a hybrid latent mixture-of-experts architecture to help developers create and run multi-agent AI systems at scale.
The company positions Nemotron 3 for organisations moving from single-model chatbots to multi-agent systems, where developers face issues such as communication overhead, context drift, high inference costs and the need for transparency when automating complex workflows.
Nvidia links Nemotron 3 to its wider sovereign AI efforts, stating that organisations in regions from Europe to South Korea are adopting open, transparent and efficient models that enable them to build AI aligned with their own data, regulations and values.
Nvidia founder and CEO Jensen Huang commented: “Open innovation is the foundation of AI progress.
“With Nemotron, we’re transforming advanced AI into an open platform that gives developers the transparency and efficiency they need to build agentic systems at scale.”
Open Nemotron 3 models give startups a way to build and iterate more quickly on AI agents and to move from prototype to enterprise rollout.
The Nemotron 3 family consists of three MoE models. Nemotron 3 Nano is described as a small 30-billion-parameter model that can activate up to three billion parameters at a time for targeted tasks.
Nemotron 3 Super has 100 billion parameters, with up to 10 billion active per token, and is aimed at multi-agent applications that rely on reasoning.
Nemotron 3 Ultra has 500 billion parameters, with up to 50 billion active per token, and is positioned as a reasoning engine for complex AI applications. It is optimised for uses such as software debugging, content summarisation, AI assistant workflows and information retrieval at low inference cost.
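The sparse-activation figures above can be illustrated with simple arithmetic. The sketch below uses only the total and per-token active parameter counts reported here; the rounding and presentation are illustrative, not an official specification.

```python
# Total vs. per-token active parameters for the Nemotron 3 family,
# as reported in the announcement (illustrative calculation only).
MODELS = {
    # name: (total parameters, max active parameters per token)
    "Nano":  (30e9,  3e9),
    "Super": (100e9, 10e9),
    "Ultra": (500e9, 50e9),
}

for name, (total, active) in MODELS.items():
    frac = active / total  # fraction of the model active per token
    print(f"Nemotron 3 {name}: {active / 1e9:.0f}B of {total / 1e9:.0f}B "
          f"parameters active per token ({frac:.0%})")
```

At these reported figures, each model activates roughly a tenth of its parameters per token, which is how a mixture-of-experts design keeps per-token compute well below what a dense model of the same total size would need.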
The family’s hybrid MoE architecture improves efficiency and scalability. It delivers up to four times higher token throughput than Nemotron 2 Nano and cuts reasoning-token generation by up to 60%, lowering inference costs.
Nemotron 3 Nano has a 1-million-token context window, which Nvidia says allows it to retain more information and connect data over long, multistep tasks.
Nemotron 3 Super is suited to applications in which many agents work together on complex tasks with low latency, while Nemotron 3 Ultra is aimed at AI workflows that need deep research and strategic planning.
Both Super and Ultra use Nvidia’s 4-bit NVFP4 training format on the Nvidia Blackwell architecture, which the company says cuts memory requirements and speeds up training.
This format allows larger models to be trained on existing infrastructure “without compromising accuracy relative to higher-precision formats.”
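The memory saving from a 4-bit weight format can be sketched with back-of-the-envelope arithmetic. This is an idealised illustration, not the exact NVFP4 layout: real low-precision formats also store per-block scale factors, which this sketch ignores.

```python
def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Storage needed for model weights at a given precision, in bytes."""
    return n_params * bits_per_weight / 8

params = 100e9  # Nemotron 3 Super's reported total parameter count

fp16 = weight_bytes(params, 16)  # 16-bit baseline
fp4 = weight_bytes(params, 4)    # idealised 4-bit weights

print(f"16-bit: {fp16 / 1e9:.0f} GB, 4-bit: {fp4 / 1e9:.0f} GB "
      f"({fp16 / fp4:.0f}x smaller)")
```

Even in this simplified form, the factor-of-four reduction in weight storage shows why a 4-bit training format lets larger models fit on existing accelerator memory.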
Nvidia states that the three-model lineup allows developers to select open models matched to specific workloads and to scale from dozens to hundreds of agents, while gaining faster, more accurate long-horizon reasoning for complex workflows.
Alongside the models, Nvidia has released training datasets and reinforcement learning libraries for building specialised AI agents.
The company has provided three trillion tokens of Nemotron pretraining, post-training and reinforcement learning datasets, which include reasoning, coding and multistep workflow examples to support the creation of domain-specialised agents.
The Nemotron Agentic Safety Dataset supplies real-world telemetry to help teams evaluate and improve the safety of complex agent systems.
To support development, Nvidia has released the NeMo Gym and NeMo RL open-source libraries, which offer training environments and post-training foundations for Nemotron models, along with NeMo Evaluator to check model safety and performance.
All these tools and datasets are now available on GitHub and Hugging Face.
Nemotron 3 is supported by LM Studio, llama.cpp, SGLang and vLLM. Prime Intellect and Unsloth are integrating NeMo Gym’s training environments into their own workflows, providing teams with faster access to reinforcement learning training.
Nemotron 3 Nano is currently available on Hugging Face and through inference providers Baseten, DeepInfra, Fireworks, FriendliAI, OpenRouter and Together AI.
Nemotron is also offered on enterprise AI and data infrastructure platforms Couchbase, DataRobot, H2O.ai, JFrog, Lambda and UiPath.
For customers using public cloud platforms, Nemotron 3 Nano will be available on AWS via Amazon Bedrock (serverless) and will be supported on CoreWeave, Crusoe, Google Cloud, Microsoft Foundry, Nebius, Nscale and Yotta. Nemotron 3 Nano is also offered as an Nvidia NIM microservice for deployment on Nvidia-accelerated infrastructure.
Nvidia is expected to make Nemotron 3 Super and Ultra available in the first half of 2026.
Nvidia also separately announced that it has acquired SchedMD, the developer of Slurm, an open-source workload management system widely used in high-performance computing and AI. The company says the acquisition supports the open-source software ecosystem and advances AI-related work for researchers, developers and enterprises.
Nvidia plans to continue developing and distributing Slurm as open-source, vendor-neutral software, and to keep it broadly available and supported across a range of hardware and software environments.
