Llama 4 Maverick 17B 128E Instruct FP8
Llama 4 Maverick 17B 128E Instruct FP8 is a text model offered through Deepinfra, with a context window of 1.0M tokens and a maximum output of 1.0M tokens. Pricing starts at $0.15 per million input tokens and $0.60 per million output tokens; Deepinfra is the cheapest of the listed providers.
Specifications
| Field | Value |
|---|---|
| Model Key | deepinfra/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 |
| Provider | Deepinfra |
| LiteLLM Provider | deepinfra |
| Mode | Text |
| Canonical Name | llama-maverick-4-17b-128e |
| Context Window | 1.0M tokens |
| Max Output | 1.0M tokens |
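The model key above follows the `provider/model-id` pattern, where the first path segment matches the LiteLLM provider and the remainder is the provider-side model name. A minimal sketch of splitting such a key, assuming that convention holds; the helper name `split_model_key` is hypothetical and not part of any library:

```python
def split_model_key(model_key: str) -> tuple[str, str]:
    """Split a 'provider/model-id' key at the first slash.

    Hypothetical helper for illustration; the model id itself may
    contain further slashes (e.g. an organization prefix).
    """
    provider, _, model_id = model_key.partition("/")
    if not provider or not model_id:
        raise ValueError(f"expected 'provider/model-id', got {model_key!r}")
    return provider, model_id


provider, model_id = split_model_key(
    "deepinfra/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8"
)
print(provider)  # provider prefix
print(model_id)  # remaining model identifier
```

Splitting only at the first slash keeps the `meta-llama/` organization prefix attached to the model identifier.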
Capabilities
✗ Vision
✗ Function Calling
✗ Reasoning
✗ JSON Schema
✗ System Messages
✗ Web Search
✗ Prompt Caching
✗ Audio Input
✗ Audio Output
Pricing
| Type | Per 1K Tokens | Per 1M Tokens |
|---|---|---|
| Input Tokens | $0.000150 | $0.150 |
| Output Tokens | $0.000600 | $0.600 |
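The per-token rates in the table translate into a request cost by multiplying each token count by its rate. The following is a rough sketch of that arithmetic using the Deepinfra prices listed above; the function name `estimate_cost` and the example token counts are illustrative assumptions, and actual billing may differ:

```python
from decimal import Decimal

# Prices in USD per 1M tokens, taken from the pricing table above.
INPUT_PRICE_PER_M = Decimal("0.15")
OUTPUT_PRICE_PER_M = Decimal("0.60")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    return (input_cost + output_cost) / TOKENS_PER_M


# Example: 200,000 input tokens and 50,000 output tokens.
print(estimate_cost(200_000, 50_000))
```

`Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding error.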
Price Comparison by Provider
Compare prices for Llama 4 Maverick 17B 128E Instruct FP8 across different providers. The same model may be available through multiple providers at different price points.
| Provider | Model Key | Input Price (per 1M tokens) | Output Price (per 1M tokens) |
|---|---|---|---|
| Groq | groq/meta-llama/llama-4-maverick-17b-128e-instruct | $0.200 | $0.600 |
| Deepinfra | deepinfra/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | $0.150 | $0.600 |
| Together AI | together_ai/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | $0.270 | $0.850 |
| Google Vertex AI | vertex_ai/meta/llama-4-maverick-17b-128e-instruct-maas | $0.350 | $1.15 |
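Which provider is cheapest depends on the mix of input and output tokens in a workload. As a sketch under the assumption that the prices in the table above are per 1M tokens and current, the comparison could be done as follows; `PRICES` and `cheapest_provider` are illustrative names, not part of any library:

```python
from decimal import Decimal

# (input, output) prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "Groq": (Decimal("0.20"), Decimal("0.60")),
    "Deepinfra": (Decimal("0.15"), Decimal("0.60")),
    "Together AI": (Decimal("0.27"), Decimal("0.85")),
    "Google Vertex AI": (Decimal("0.35"), Decimal("1.15")),
}


def workload_cost(provider: str, input_tokens: int, output_tokens: int) -> Decimal:
    """USD cost of a workload at one provider's listed prices."""
    input_price, output_price = PRICES[provider]
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / Decimal(1_000_000)


def cheapest_provider(input_tokens: int, output_tokens: int) -> str:
    """Return the provider with the lowest cost for the given workload."""
    return min(PRICES, key=lambda p: workload_cost(p, input_tokens, output_tokens))


# Example workload: 1M input tokens and 1M output tokens.
print(cheapest_provider(1_000_000, 1_000_000))
```

For any workload with a non-zero number of input tokens, the lower input rate decides between the two providers that share the same output price.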
Similar Models
Models with similar capabilities and context window size.
| Model | Provider | Mode | Input Price | Output Price | Context | Max Output | Vision | Functions |
|---|---|---|---|---|---|---|---|---|
| Gemini 1.5 Flash-8b | Google Gemini | Text | N/A | N/A | 1.0M | 8K | yes | yes |
| Gemini 1.5 Flash-8b-exp-0924 | Google Gemini | Text | N/A | N/A | 1.0M | 8K | yes | yes |
| Gemini 1.5 Flash-exp-0827 | Google Gemini | Text | N/A | N/A | 1.0M | 8K | yes | yes |
| Gemini 2.0 Flash-exp | Google Gemini | Text | N/A | N/A | 1.0M | 8K | yes | yes |
| Gemini 2.0 Flash-thinking-exp | Google Vertex AI | Text | N/A | N/A | 1.0M | 8K | yes | yes |
| Gemini 2.0 Flash-thinking-exp | Google Gemini | Text | N/A | N/A | 1.0M | 66K | yes | yes |
| Gemini 2.0 Flash-thinking-exp-01-21 | Google Vertex AI | Text | N/A | N/A | 1.0M | 66K | yes | no |
| Gemini 2.0 Flash-thinking-exp-01-21 | Google Gemini | Text | N/A | N/A | 1.0M | 66K | yes | yes |
| Gemini 2.5 Pro-exp-03-25 | Google Gemini | Text | N/A | N/A | 1.0M | 66K | yes | yes |
| Gemini Experimental 1114 | Google Gemini | Text | N/A | N/A | 1.0M | 8K | yes | yes |