Nemotron Super 3 120B A12B is NVIDIA's language model with a 256K context window, starting at $0.5 / 1M input and $1.50 / 1M output. Hybrid mixture-of-experts model with 120B total and 12B active parameters, designed for high-accuracy multi-agent and specialized agentic AI applications.
Specifications
Canonical IDnvidia-nemotron-3-120b-a12b
TypeLanguage
StatusActive
CreatorNVIDIANVIDIA
Providers
Context Window256K tokens
Input ModalitiesText
Output ModalitiesText
Reasoning Effortsdefault

Capabilities

Input1/5
Text
Image·
Audio·
Video·
PDF·
Output1/5
Text
Image·
Audio·
Video·
Embedding·
Capabilities2/13
Reasoning
Adaptive Reasoning·
Function Calling
Parallel Function Calling·
Structured Outputs·
Native JSON Schema·
Web Search·
URL Context·
Computer Use·
Code Execution·
File Search·
Prompt Caching·
Assistant Prefill·

Pricing by Provider

US Dollar ($)
Per 1M tokens
ProviderStandard
Input
$ / 1M
Output
$ / 1M
Cloudflare Workers AI logo
Cloudflare Workers AI
@cf/nvidia/nemotron-3-120b-a12b
$0.5$1.50

Cost Calculator

US Dollar ($)
Preset:

Versions

VersionReleasedContextInput / 1MOutput / 1MStatus
Nemotron 4 15BAvailable
Nemotron 3 Ultra 550B A55B1.0M$0.500$2.40Available
Nemotron 3.5 Content Safety128KAvailable
Nemotron Nano 3 30B A3B Omni Reasoning256KAvailable
Nemotron Super 3 120B256K$0.150$0.650Available
Nemotron Nano 3 30B262K$0.060$0.240Available
Nemotron Nano 3 30B A3B ReasoningAvailable
Nemotron Nano 3 30B A3B OmniAvailable
Nemotron Nano 3 4BAvailable
Nemotron 3Available
Nemotron Super 3 120B A12B256K$0.500$1.50Current

Model IDs

@cf/nvidia/nemotron-3-120b-a12b
nvidia-nemotron-3-120b-a12b