Llama 4 Maverick 17B 128E Instruct FP8

Azure AIText

Llama 4 Maverick 17B 128E Instruct FP8 is a text model from Azure AI with a context window of 1.0M tokens and max output of 16K tokens. Pricing starts at $1.41 per million input tokens and $0.35 per million output tokens (cheapest at AWS Bedrock).

Specifications

Model Keyazure_ai/Llama-4-Maverick-17B-128E-Instruct-FP8
ProviderAzure AI
LiteLLM Providerazure_ai
ModeText
Canonical Namellama-maverick-4-17b
Context Window1.0M tokens
Max Output16K tokens

Capabilities

Vision Function Calling Reasoning JSON Schema System Messages Web Search Prompt Caching Audio Input Audio Output

Pricing

TypePer 1K TokensPer 1M Tokens
Input Tokens$0.0014$1.41
Output Tokens$0.000350$0.350

Price Comparison by Provider

Compare prices for Llama 4 Maverick 17B 128E Instruct FP8 across different providers. The same model may be available through multiple providers at different price points.

Provider
Model Key
Input Price
Output Price
Azure AIazure_ai/Llama-4-Maverick-17B-128E-Instruct-FP8$1.41$0.350
AWS Bedrockus.meta.llama4-maverick-17b-instruct-v1:0$0.240$0.970

Similar Models

Models with similar capabilities and context window size.

Model
Provider
Mode
Input Price
Output Price
Context
Max Output
Vision
Functions
Gemini 1.5 FlashGoogle Vertex AIText$0.075$0.3001.0M8Kyesyes
Gemini 1.5 Flash Preview-0514Google Vertex AIText$0.075$0.00471.0M8Kyesyes
Gemini 1.5 Flash-001Google Vertex AIText$0.075$0.3001.0M8Kyesyes
Gemini 1.5 Flash-8b-exp-0827Google GeminiTextN/AN/A1.0M8Kyesyes
Gemini 1.5 Flash-exp-0827Google Vertex AIText$0.0047$0.00471.0M8Kyesyes
Gemini flash ExperimentalGoogle Vertex AITextN/AN/A1.0M8Knono
Gemini pro ExperimentalGoogle Vertex AITextN/AN/A1.0M8Knono
Qwen TurboDashscopeText$0.050$0.2001.0M8Knoyes
Qwen TurboDashscopeText$0.050$0.2001.0M16Knoyes
Qwen TurboDashscopeText$0.050$0.2001.0M16Knoyes