Llama 3.1 Nemotron Ultra 253B V1

Llama 3.1 Nemotron Ultra 253B V1 is an NVIDIA text model served through Nebius, with a context window of 128K tokens and a maximum output of 128K tokens. Pricing starts at 0.60 per million input tokens and 1.80 per million output tokens.

Capabilities

Vision, Function Calling, Reasoning, JSON Schema, System Messages, Web Search, Prompt Caching, Audio Input, Audio Output

Specifications

Model Key: nebius/nvidia/Llama-3.1-Nemotron-Ultra-253B-v1
Provider: Nebius
Provider ID: nebius
Mode: Text
Canonical Name: llama-nemotron-ultra-3.1-253b-1
Context Window: 128K tokens
Max Output: 128K tokens

Pricing

Type             Per 1K Tokens    Per 1M Tokens
Input Tokens     0.000600         0.600
Output Tokens    0.0018           1.80
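
As a rough illustration of how the per-token rates above translate into a per-request cost, the following sketch multiplies token counts by the per-1M rates from the table. The function name `estimate_cost` and the example token counts are hypothetical, and the result is in whatever currency the table uses, which the source does not state.

```python
from decimal import Decimal

# Rates per 1M tokens, copied from the pricing table above.
# The currency is not stated in the source, so results are in the table's units.
INPUT_RATE_PER_M = Decimal("0.60")
OUTPUT_RATE_PER_M = Decimal("1.80")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request (hypothetical helper, not a provider API)."""
    input_cost = Decimal(input_tokens) * INPUT_RATE_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_RATE_PER_M
    return (input_cost + output_cost) / TOKENS_PER_M


# Hypothetical request: 10,000 prompt tokens and 2,000 completion tokens.
print(estimate_cost(10_000, 2_000))
```

`Decimal` is used instead of floats so that small per-token amounts add up without binary rounding noise. Output tokens cost three times as much as input tokens at these rates, so long completions dominate the bill.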

Benchmarks

Benchmark              Score    Rank
Intelligence Index     15.0     #130
Coding Index           13.1     #114
Math Index             63.7     #38
MMLU-Pro               0.8      #30
GPQA                   0.7      #59
HLE                    0.1      #57
LiveCodeBench          0.6      #40
AIME                   0.7      #15
IFBench                0.4      #91
Time to First Token    0.69s    #143
SciCode                0.3      #71
MATH-500               1.0      #20
AIME 2025              0.6      #38
LCR                    0.1      #134
TerminalBench Hard     0.0      #124
TAU2                   0.1      #145