Llama 3.1 405B Instruct is Meta's 405B-parameter instruction-tuned language model, with a 131K-token context window and up to 16K output tokens. It is available from 11 providers, with prices starting at $0.120 per 1M input tokens and $0.300 per 1M output tokens. The model is optimized for following complex instructions and uses FP8 quantization for efficient large-scale inference.
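The per-token pricing above can be turned into a rough per-request estimate. The following is a minimal sketch, assuming the quoted "starting at" rates apply uniformly to every token; actual provider rates may be higher, and the function name `estimate_cost_usd` and the sample token counts are hypothetical.

```python
# Rates are the "starting at" prices quoted above, in USD per 1M tokens.
# Assumption: a provider bills exactly these rates with no minimums or tiers.
INPUT_USD_PER_M = 0.120
OUTPUT_USD_PER_M = 0.300


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_rate: float = INPUT_USD_PER_M,
    output_rate: float = OUTPUT_USD_PER_M,
) -> float:
    """Return the estimated cost in USD for one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Prices are quoted per one million tokens, so scale the total down.
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 100,000 prompt tokens and 4,000 completion tokens.
    print(f"${estimate_cost_usd(100_000, 4_000):.4f}")
```

Passing a different provider's rates through `input_rate` and `output_rate` gives a like-for-like comparison for the same workload.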
Specifications
Canonical ID: meta-llama-3-1-405b-instruct
Type: Language
Status: Deprecating
Creator: Meta
Providers: 11
Context Window: 131K tokens
Max Output: 16K tokens
Input Modalities: Image, Text
Output Modalities: Text
Parameters: 405B
Release Date: 2 years ago
Deprecation Date:
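The context window and max output figures jointly bound how long a completion can be for a given prompt. Below is a minimal sketch of that budget check, assuming the prompt and the generated output share the same context window (typical for decoder-only models, though a provider may enforce limits differently) and assuming "131K" and "16K" mean 131,000 and 16,000 tokens; the exact limits and the helper name `max_completion_tokens` are illustrative.

```python
# Limits as listed in the specifications; the exact token counts behind
# "131K" and "16K" are an assumption and may differ by provider.
CONTEXT_WINDOW_TOKENS = 131_000
MAX_OUTPUT_TOKENS = 16_000


def max_completion_tokens(prompt_tokens: int) -> int:
    """Return the largest output budget that fits both limits."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Tokens left in the window once the prompt has been counted.
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    # The budget is capped by the output limit and never drops below zero.
    return max(0, min(MAX_OUTPUT_TOKENS, remaining))


if __name__ == "__main__":
    for prompt in (10_000, 120_000, 200_000):
        print(prompt, max_completion_tokens(prompt))
```

A result of 0 signals that the prompt alone already fills or exceeds the window and needs to be shortened before sending.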
Benchmarks
Intelligence Index: 17.4 (rank #255)
Coding Index: 14.5 (rank #229)
Math Index: 3.0 (rank #249)
MMLU-Pro: 0.7 (rank #181)
GPQA: 0.5 (rank #308)
HLE: 0.0 (rank #371)
LiveCodeBench: 0.3 (rank #204)
AIME: 0.2 (rank #92)
IFBench: 0.4 (rank #234)
Time to First Token: 0.67s (rank #286)
SciCode: 0.3 (rank #224)
MATH-500: 0.7 (rank #126)
AIME 2025: 0.0 (rank #249)
LCR: 0.2 (rank #226)
TerminalBench Hard: 0.1 (rank #226)
TAU2: 0.2 (rank #312)
Output TPS: 58.1 (rank #194)

Capabilities

Input (2/5)
Supported: Text, Image
Not supported: Audio, Video, PDF

Output (1/5)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (3/13)
Supported: Function Calling, Parallel Function Calling, Structured Outputs
Not supported: Reasoning, Adaptive Reasoning, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

Versions

Version | Released | Context | Input / 1M | Output / 1M | Status
Llama 3.3 70B Instruct | | 131K | $0.100 | $0.200 | Available
Llama 3.2 3B Instruct | | 131K | $0.015 | $0.020 | Deprecated
Llama 3.2 1B Instruct | | 131K | $0.027 | $0.080 | Deprecated
Llama 3.1 405B Instruct | | 131K | $0.120 | $0.300 | Current
Llama 3.1 70B Instruct | | 131K | $0.100 | $0.100 | Available
Llama 3.1 8B Instruct | | 200K | $0.020 | $0.030 | Available
Llama 3.1 70B | | 128K | $0.600 | $0.600 | Available
Llama 3.1 8B | | 131K | $0.030 | $0.050 | Available
Llama 3 70B Instruct | | 131K | $0.120 | $0.300 | Available
Llama 3 8B Instruct | | 32K | $0.030 | $0.040 | Available
Llama 3.1 Tulu3 405B | | | | | Available

Model IDs