Llama3.1 405B Instruct Fp8

Llama3.1 405B Instruct Fp8 is a text model served by Lambda with a 131K-token context window and a maximum output of 131K tokens. On Lambda it is priced at $0.80 per million input tokens and $0.80 per million output tokens; the lowest-priced listed provider for this model is Fireworks AI, at $0.10 per million tokens for both input and output.

Capabilities

Vision · Function Calling · Reasoning · JSON Schema · System Messages · Web Search · Prompt Caching · Audio Input · Audio Output

Specifications

Model Key: lambda_ai/llama3.1-405b-instruct-fp8
Provider: Lambda
Provider ID: lambda_ai
Mode: Text
Canonical Name: llama-3.1-405b
Context Window: 131K tokens
Max Output: 131K tokens

Pricing

Type | Per 1K Tokens | Per 1M Tokens
Input Tokens | $0.000800 | $0.800
Output Tokens | $0.000800 | $0.800
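With per-token prices this small, it is easy to misplace a decimal when budgeting. A minimal sketch (plain Python, with Lambda's listed rates hard-coded as constants) of estimating the cost of one request:

```python
# Lambda's listed rates for lambda_ai/llama3.1-405b-instruct-fp8, USD per 1M tokens.
INPUT_PRICE_PER_M = 0.80
OUTPUT_PRICE_PER_M = 0.80

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a request with 2,000 prompt tokens and 500 completion tokens:
print(f"${request_cost(2_000, 500):.6f}")  # $0.002000
```

At these symmetric rates, one million tokens in plus one million tokens out comes to $1.60 total.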

Benchmarks

Benchmark | Score | Rank
Intelligence Index | 17.4 | #107
Coding Index | 14.5 | #100
Math Index | 3.0 | #134
MMLU-Pro | 0.7 | #94
GPQA | 0.5 | #129
HLE | 0.0 | #161
LiveCodeBench | 0.3 | #107
AIME | 0.2 | #63
IFBench | 0.4 | #87
Time to First Token | 0.52s | #132
SciCode | 0.3 | #100
MATH-500 | 0.7 | #93
AIME 2025 | 0.0 | #134
LCR | 0.2 | #93
TerminalBench Hard | 0.1 | #87
TAU2 | 0.2 | #126

Price Comparison by Provider

Compare prices for Llama3.1 405B Instruct Fp8 across different providers. The same model may be available through multiple providers at different price points.

Provider | Model Key | Input Price, $/1M | Output Price, $/1M
Vertex AI (Llama) | vertex_ai/meta/llama-3.1-405b-instruct-maas | 5.00 | 16.00
Together AI | together_ai/meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo | 3.50 | 3.50
Snowflake | snowflake/snowflake-llama-3.1-405b | N/A | N/A
SambaNova | sambanova/Meta-Llama-3.1-405B-Instruct | 5.00 | 10.00
Oracle Cloud (OCI) | oci/meta.llama-3.1-405b-instruct | 10.68 | 10.68
Nebius | nebius/meta-llama/Meta-Llama-3.1-405B-Instruct | 1.00 | 3.00
AWS Bedrock | meta.llama3-1-405b-instruct-v1:0 | 5.32 | 16.00
Lambda | lambda_ai/llama3.1-405b-instruct-fp8 | 0.800 | 0.800
Hyperbolic | hyperbolic/meta-llama/Meta-Llama-3.1-405B-Instruct | 0.120 | 0.300
Fireworks AI | fireworks_ai/accounts/fireworks/models/llama-v3p1-405b-instruct-long | 0.100 | 0.100
Databricks | databricks/databricks-meta-llama-3-1-405b-instruct | 5.00 | 15.00
Azure AI | azure_ai/Meta-Llama-3.1-405B-Instruct | 5.33 | 16.00
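Because input and output prices differ per provider, the cheapest choice depends on your traffic mix. A sketch that ranks providers by a blended per-million-token price, assuming a fixed fraction of output tokens (prices copied from the table above; Snowflake is skipped because its prices are listed as N/A):

```python
# USD per 1M tokens, (input, output), copied from the comparison table.
PROVIDERS = {
    "Vertex AI (Llama)": (5.00, 16.00),
    "Together AI": (3.50, 3.50),
    "SambaNova": (5.00, 10.00),
    "Oracle Cloud (OCI)": (10.68, 10.68),
    "Nebius": (1.00, 3.00),
    "AWS Bedrock": (5.32, 16.00),
    "Lambda": (0.80, 0.80),
    "Hyperbolic": (0.12, 0.30),
    "Fireworks AI": (0.10, 0.10),
    "Databricks": (5.00, 15.00),
    "Azure AI": (5.33, 16.00),
}

def blended_price(in_price: float, out_price: float,
                  out_ratio: float = 0.25) -> float:
    """Cost per 1M total tokens when out_ratio of the tokens are output."""
    return (1 - out_ratio) * in_price + out_ratio * out_price

cheapest = min(PROVIDERS, key=lambda p: blended_price(*PROVIDERS[p]))
print(cheapest)  # Fireworks AI
```

Note that raw price is only one axis: as the variants table below shows, the cheapest endpoints often trade away context length or function calling.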

All Variants

All available versions, regions, and API endpoints for Llama3.1 405B Instruct Fp8.

Model Key | Provider | Mode | Input $/1M | Output $/1M | Context | Max Output | Vision | Functions
meta.llama3-1-405b-instruct-v1:0 | AWS Bedrock | Text | 5.32 | 16.00 | 128K | 4K | no | yes
us.meta.llama3-1-405b-instruct-v1:0 | AWS Bedrock | Text | 5.32 | 16.00 | 128K | 4K | no | yes
azure_ai/Meta-Llama-3.1-405B-Instruct | Azure AI | Text | 5.33 | 16.00 | 128K | 2K | no | no
databricks/databricks-meta-llama-3-1-405b-instruct | Databricks | Text | 5.00 | 15.00 | 128K | 128K | no | no
fireworks_ai/accounts/fireworks/models/llama-v3p1-405b-instruct | Fireworks AI | Text | 3.00 | 3.00 | 128K | 16K | no | yes
fireworks_ai/accounts/fireworks/models/llama-v3p1-405b-instruct-long | Fireworks AI | Text | 0.100 | 0.100 | 4K | 4K | no | no
hyperbolic/meta-llama/Meta-Llama-3.1-405B-Instruct | Hyperbolic | Text | 0.120 | 0.300 | 33K | 33K | no | yes
lambda_ai/llama3.1-405b-instruct-fp8 | Lambda | Text | 0.800 | 0.800 | 131K | 131K | no | yes
nebius/meta-llama/Meta-Llama-3.1-405B-Instruct | Nebius | Text | 1.00 | 3.00 | 128K | 128K | no | yes
oci/meta.llama-3.1-405b-instruct | Oracle Cloud (OCI) | Text | 10.68 | 10.68 | 128K | 4K | no | yes
sambanova/Meta-Llama-3.1-405B-Instruct | SambaNova | Text | 5.00 | 10.00 | 16K | 16K | no | yes
snowflake/llama3.1-405b | Snowflake | Text | N/A | N/A | 128K | 8K | no | no
snowflake/snowflake-llama-3.1-405b | Snowflake | Text | N/A | N/A | 8K | 8K | no | no
together_ai/meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo | Together AI | Text | 3.50 | 3.50 | N/A | N/A | no | yes
vertex_ai/meta/llama-3.1-405b-instruct-maas | Vertex AI (Llama) | Text | 5.00 | 16.00 | 128K | 2K | yes | no
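When selecting a variant programmatically, you typically filter on exactly these columns. A sketch over a few representative rows (model keys and values copied from the table; the `usable` helper is hypothetical, not part of any provider SDK):

```python
# A few rows from the variants table:
# (model_key, context_tokens, max_output_tokens, supports_functions)
VARIANTS = [
    ("lambda_ai/llama3.1-405b-instruct-fp8", 131_000, 131_000, True),
    ("fireworks_ai/accounts/fireworks/models/llama-v3p1-405b-instruct-long",
     4_000, 4_000, False),
    ("hyperbolic/meta-llama/Meta-Llama-3.1-405B-Instruct", 33_000, 33_000, True),
    ("azure_ai/Meta-Llama-3.1-405B-Instruct", 128_000, 2_000, False),
]

def usable(min_context: int, need_functions: bool) -> list[str]:
    """Return model keys whose context window and tool support fit the request."""
    return [key for key, ctx, _max_out, fns in VARIANTS
            if ctx >= min_context and (fns or not need_functions)]

print(usable(min_context=100_000, need_functions=True))
# ['lambda_ai/llama3.1-405b-instruct-fp8']
```

Dropping the function-calling requirement also admits the Azure AI variant, since its 128K context clears the threshold even though its max output is only 2K.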