NVIDIA logo

Llama 3.3 Nemotron 1.5 Super 49B

Llama 3.3 Nemotron 1.5 Super 49B is NVIDIA's language model with a 131K context window and up to 16K output tokens, available from 2 providers, starting at $0.100 / 1M input and $0.400 / 1M output. A 49B-parameter LLM derived from Llama 3.3 70B, optimized by NVIDIA for reasoning, RAG, and tool-calling with a compute-efficient Super architecture at version 1.5.
Specifications
Canonical IDnvidia-llama-3-3-nemotron-1-5-super-49b
TypeLanguage
StatusActive
CreatorNVIDIANVIDIA
Providers
Context Window131K tokens
Max Output16K tokens
Input ModalitiesText
Output ModalitiesText
Reasoning Effortsdefault
Parameters49B
Release Date · 7 months ago
Knowledge Cutoff

Capabilities

Input1/5
Text
Image·
Audio·
Video·
PDF·
Output1/5
Text
Image·
Audio·
Video·
Embedding·
Capabilities3/13
Reasoning
Adaptive Reasoning·
Function Calling
Parallel Function Calling·
Structured Outputs
Native JSON Schema·
Web Search·
URL Context·
Computer Use·
Code Execution·
File Search·
Prompt Caching·
Assistant Prefill·

Pricing by Provider

ProviderStandard
Input
$ / 1M
Output
$ / 1M
DeepInfra logo
DeepInfra
deepinfra/nvidia/Llama-3.3-Nemotron-Super-49B-v1.5
$0.100$0.400
OpenRouter logo
OpenRouter
nvidia/llama-3.3-nemotron-super-49b-v1.5
$0.100$0.400

Cost Calculator

Preset:

Versions

VersionReleasedContextInput / 1MOutput / 1MStatus
Llama 3.3 Nemotron 1.5 Super 49B131K$0.100$0.400Current
Llama Nemotron 1.5 Super 49B ReasoningAvailable
Llama Nemotron 1.5 Super 49BAvailable
Llama 3.3 Nemotron Super 49B ReasoningAvailable
Llama 3.3 Nemotron Super 49BAvailable

Other models

ModelTierReleasedContextInput / 1MOutput / 1M
Llama 3.3 70B Instruct131K$0.100$0.200
Llama 3.2 3B Instruct131K$0.015$0.020
Llama 3.2 1B Instruct131K$0.027$0.080
Llama 3.1 405B Instruct131K$0.120$0.300
Llama 3.1 70B Instruct131K$0.100$0.100
Llama 3.1 8B Instruct200K$0.020$0.030
Llama 3.1 70B128K$0.600$0.600
Llama 3.1 8B131K$0.030$0.050
Llama 3 70B Instruct131K$0.120$0.300
Llama 3 8B Instruct32K$0.030$0.040

Model IDs