Name: Llama 3.1 Nemotron 1 Ultra 253B
Brand: NVIDIA

Llama 3.1 Nemotron 1 Ultra 253B is NVIDIA's language model with a 128K context window, starting at $0.600 / 1M input and $1.80 / 1M output. A 253B-parameter ultra-scale LLM fine-tuned by NVIDIA on Llama 3.1, optimized for advanced reasoning and high-accuracy agentic tasks.

Specifications
Canonical ID	`nvidia-llama-3-1-nemotron-1-ultra-253b`
Type	Language
Status	Active
Creator	NVIDIA
Providers	Nebius
Context Window	128K tokens
Input Modalities	Text
Output Modalities	Text
Parameters	253B

Capabilities

Input1/5

Text✓

Image·

Audio·

Video·

PDF·

Output1/5

Text✓

Image·

Audio·

Video·

Embedding·

Capabilities1/13

Reasoning·

Adaptive Reasoning·

Function Calling✓

Parallel Function Calling·

Structured Outputs·

Native JSON Schema·

Web Search·

URL Context·

Computer Use·

Code Execution·

File Search·

Prompt Caching·

Assistant Prefill·

Pricing by Provider

Provider	Standard
Provider	Input $ / 1M	Output $ / 1M
Nebius nebius/nvidia/Llama-3.1-Nemotron-Ultra-253B-v1	$0.600	$1.80

Cost Calculator

Preset:

Input tokens

Output tokens

Number of calls

Versions

Version	Released	Context	Input / 1M	Output / 1M	Status
Llama 3.1 Nemotron 1 Ultra 253B	—	128K	$0.600	$1.80	Current
Llama 3.1 Nemotron 1 Ultra 253B Reasoning	—	—	—	—	Available

Other models

Model	Tier	Released	Context	Input / 1M	Output / 1M
Llama 3.3 70B Instruct	—	2024-12-06	131K	$0.100	$0.200
Llama 3.2 3B Instruct	—	2024-09-25	131K	$0.015	$0.020
Llama 3.2 1B Instruct	—	2024-09-25	131K	$0.027	$0.080
Llama 3.1 405B Instruct	—	2024-07-23	131K	$0.120	$0.300
Llama 3.1 70B Instruct	—	2024-07-23	131K	$0.100	$0.100
Llama 3.1 8B Instruct	—	2024-07-23	200K	$0.020	$0.030
Llama 3.1 70B	—	2024-07-23	128K	$0.600	$0.600
Llama 3.1 8B	—	2024-07-23	131K	$0.030	$0.050
Llama 3 70B Instruct	—	2024-04-18	131K	$0.120	$0.300
Llama 3 8B Instruct	—	2024-04-18	32K	$0.030	$0.040

Llama 3.1 Nemotron 1 Ultra 253B

Capabilities

Pricing by Provider

Cost Calculator

Versions

Other models

Model IDs