Llama 4 Maverick 17B 128E Instruct FP8 Pricing & Specs | AI Models

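Llama 4 Maverick 17B 128E Instruct FP8 is a text model offered through Azure AI with a context window of 1.0M tokens and a maximum output of 16K tokens. On Azure AI, pricing is $1.41 per million input tokens and $0.350 per million output tokens. Among the providers listed on this page, Lambda is the cheapest at $0.050 per million input tokens and $0.100 per million output tokens.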

Capabilities

✓ Vision
✓ Function Calling
✗ Reasoning
✗ JSON Schema
✗ System Messages
✗ Web Search
✗ Prompt Caching
✗ Audio Input
✗ Audio Output

Specifications

Model Key	`azure_ai/Llama-4-Maverick-17B-128E-Instruct-FP8`
Provider	Azure AI
Provider ID	azure_ai
Mode	Text
Canonical Name	llama-maverick-4-17b
Context Window	1.0M tokens
Max Output	16K tokens

Pricing

Type	Per 1K Tokens, $	Per 1M Tokens, $
Input Tokens	0.00141	1.41
Output Tokens	0.000350	0.350
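
To make the per-token arithmetic concrete, here is a minimal sketch that estimates the cost of a single request from the per-1M prices in the table above. The function name `estimate_cost` and the example token counts are illustrative assumptions, not part of any provider's API.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the pricing table above (Azure AI).
INPUT_PRICE_PER_M = Decimal("1.41")
OUTPUT_PRICE_PER_M = Decimal("0.350")


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    million = Decimal(1_000_000)
    total = INPUT_PRICE_PER_M * input_tokens + OUTPUT_PRICE_PER_M * output_tokens
    return total / million


# Example: a request with 10,000 input tokens and 2,000 output tokens.
cost = estimate_cost(input_tokens=10_000, output_tokens=2_000)
print(f"Estimated cost: ${cost}")
```

Decimal is used instead of float so that small per-token prices do not pick up binary rounding noise when summed over many requests.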

Benchmarks

No benchmark data is available for this model.

Price Comparison by Provider

Compare prices for Llama 4 Maverick 17B 128E Instruct FP8 across different providers. The same model may be available through multiple providers at different price points.

Provider	Model Key	Input Price, $ per 1M tokens	Output Price, $ per 1M tokens
IBM watsonx	watsonx/meta-llama/llama-4-maverick-17b	0.350	1.40
Together AI	together_ai/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8	0.270	0.850
SambaNova	sambanova/Llama-4-Maverick-17B-128E-Instruct	0.630	1.80
Oracle Cloud (OCI)	oci/meta.llama-4-maverick-17b-128e-instruct-fp8	0.720	0.720
AWS Bedrock	meta.llama4-maverick-17b-instruct-v1:0	0.240	0.970
Meta Llama	meta_llama/Llama-4-Maverick-17B-128E-Instruct-FP8	N/A	N/A
Lambda	lambda_ai/llama-4-maverick-17b-128e-instruct-fp8	0.050	0.100
Groq	groq/meta-llama/llama-4-maverick-17b-128e-instruct	0.200	0.600
Fireworks AI	fireworks_ai/accounts/fireworks/models/llama4-maverick-instruct-basic	0.220	0.880
DeepInfra	deepinfra/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8	0.150	0.600
Databricks	databricks/databricks-llama-4-maverick	0.500	1.50
Azure AI	azure_ai/Llama-4-Maverick-17B-128E-Instruct-FP8	1.41	0.350
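
Because input and output prices differ by provider, the cheapest option depends on the workload mix. The sketch below ranks the providers in the table above for a given number of input and output tokens. It assumes the listed prices are USD per 1M tokens; `rank_providers` is a hypothetical helper, and providers with N/A prices are skipped.

```python
from typing import List, Optional, Tuple

# (provider, input $/1M tokens, output $/1M tokens), copied from the table above.
# None marks a price listed as N/A.
PRICES: List[Tuple[str, Optional[float], Optional[float]]] = [
    ("IBM watsonx", 0.350, 1.40),
    ("Together AI", 0.270, 0.850),
    ("SambaNova", 0.630, 1.80),
    ("Oracle Cloud (OCI)", 0.720, 0.720),
    ("AWS Bedrock", 0.240, 0.970),
    ("Meta Llama", None, None),
    ("Lambda", 0.050, 0.100),
    ("Groq", 0.200, 0.600),
    ("Fireworks AI", 0.220, 0.880),
    ("DeepInfra", 0.150, 0.600),
    ("Databricks", 0.500, 1.50),
    ("Azure AI", 1.41, 0.350),
]


def rank_providers(input_tokens: int, output_tokens: int) -> List[Tuple[str, float]]:
    """Return (provider, cost in USD) pairs sorted from cheapest to most expensive."""
    ranked = []
    for provider, in_price, out_price in PRICES:
        if in_price is None or out_price is None:
            continue  # No listed price, so the provider cannot be ranked.
        cost = (in_price * input_tokens + out_price * output_tokens) / 1_000_000
        ranked.append((provider, cost))
    return sorted(ranked, key=lambda item: item[1])


# Example: an input-heavy workload of 5M input tokens and 1M output tokens.
for provider, cost in rank_providers(5_000_000, 1_000_000):
    print(f"{provider}: ${cost:.2f}")
```

With the listed prices, Lambda ranks first for any mix. Azure AI ranks last for a purely input-bound workload but second for a purely output-bound one, since its listed output price is lower than its input price.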

All Variants

All available versions, regions, and API endpoints for Llama 4 Maverick 17B 128E Instruct FP8.

Model Key	Provider	Mode	Input Price, $ per 1M tokens	Output Price, $ per 1M tokens	Context	Max Output	Vision	Functions
meta.llama4-maverick-17b-instruct-v1:0	AWS Bedrock	Text	0.240	0.970	128K	4K	no	yes
us.meta.llama4-maverick-17b-instruct-v1:0	AWS Bedrock	Text	0.240	0.970	128K	4K	no	yes
azure_ai/Llama-4-Maverick-17B-128E-Instruct-FP8	Azure AI	Text	1.41	0.350	1.0M	16K	yes	yes
databricks/databricks-llama-4-maverick	Databricks	Text	0.500	1.50	128K	128K	no	no
deepinfra/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8	DeepInfra	Text	0.150	0.600	1.0M	1.0M	no	yes
fireworks_ai/accounts/fireworks/models/llama4-maverick-instruct-basic	Fireworks AI	Text	0.220	0.880	131K	131K	no	no
groq/meta-llama/llama-4-maverick-17b-128e-instruct	Groq	Text	0.200	0.600	131K	8K	yes	yes
watsonx/meta-llama/llama-4-maverick-17b	IBM watsonx	Text	0.350	1.40	128K	128K	no	yes
lambda_ai/llama-4-maverick-17b-128e-instruct-fp8	Lambda	Text	0.050	0.100	131K	8K	no	yes
meta_llama/Llama-4-Maverick-17B-128E-Instruct-FP8	Meta Llama	Text	N/A	N/A	1.0M	4K	no	yes
oci/meta.llama-4-maverick-17b-128e-instruct-fp8	Oracle Cloud (OCI)	Text	0.720	0.720	512K	4K	no	yes
sambanova/Llama-4-Maverick-17B-128E-Instruct	SambaNova	Text	0.630	1.80	131K	131K	yes	yes
together_ai/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8	Together AI	Text	0.270	0.850	N/A	N/A	no	yes
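
The variants differ in context window, vision support, and function calling, so picking one usually means filtering on those columns. The following sketch does that over the rows above. The helpers `parse_tokens` and `find_variants` are hypothetical, and labels such as `131K` are treated as approximate decimal counts (131,000), which is an assumption about how the table rounds.

```python
from typing import List, Optional, Tuple


def parse_tokens(label: str) -> Optional[int]:
    """Convert a label such as '128K' or '1.0M' to an approximate token count."""
    label = label.strip().upper()
    if label == "N/A":
        return None
    multipliers = {"K": 1_000, "M": 1_000_000}
    if label[-1] in multipliers:
        return round(float(label[:-1]) * multipliers[label[-1]])
    return int(label)


# (model key, context, vision, functions), copied from the table above.
VARIANTS: List[Tuple[str, str, bool, bool]] = [
    ("meta.llama4-maverick-17b-instruct-v1:0", "128K", False, True),
    ("us.meta.llama4-maverick-17b-instruct-v1:0", "128K", False, True),
    ("azure_ai/Llama-4-Maverick-17B-128E-Instruct-FP8", "1.0M", True, True),
    ("databricks/databricks-llama-4-maverick", "128K", False, False),
    ("deepinfra/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8", "1.0M", False, True),
    ("fireworks_ai/accounts/fireworks/models/llama4-maverick-instruct-basic", "131K", False, False),
    ("groq/meta-llama/llama-4-maverick-17b-128e-instruct", "131K", True, True),
    ("watsonx/meta-llama/llama-4-maverick-17b", "128K", False, True),
    ("lambda_ai/llama-4-maverick-17b-128e-instruct-fp8", "131K", False, True),
    ("meta_llama/Llama-4-Maverick-17B-128E-Instruct-FP8", "1.0M", False, True),
    ("oci/meta.llama-4-maverick-17b-128e-instruct-fp8", "512K", False, True),
    ("sambanova/Llama-4-Maverick-17B-128E-Instruct", "131K", True, True),
    ("together_ai/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8", "N/A", False, True),
]


def find_variants(min_context: int, need_vision: bool = False,
                  need_functions: bool = False) -> List[str]:
    """Return model keys whose listed specs meet the given requirements."""
    matches = []
    for key, context, vision, functions in VARIANTS:
        tokens = parse_tokens(context)
        if tokens is None or tokens < min_context:
            continue  # Unknown or too-small context window.
        if need_vision and not vision:
            continue
        if need_functions and not functions:
            continue
        matches.append(key)
    return matches


# Example: variants with at least a 500K-token context window and vision support.
print(find_variants(min_context=500_000, need_vision=True))
```

Variants with an N/A context window are excluded rather than assumed to qualify, which errs on the side of not selecting an endpoint whose limits are unknown.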
