Grok Vision Beta is xAI's language model with an 8K context window and up to 8K output tokens, priced from $5.00 per 1M input tokens and $15.00 per 1M output tokens. It is an early beta multimodal variant of Grok with image understanding capabilities for vision-language tasks.
Capabilities
Input: 1/5
Output: 1/5
Capabilities: 2/13
Pricing by Provider
| Provider | Input $ / 1M (Standard) | Output $ / 1M (Standard) |
|---|---|---|
| xAI | $5.00 | $15.00 |
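
The per-1M-token rates listed for xAI can be turned into a per-request estimate with simple arithmetic. The sketch below is illustrative only: the function name `estimate_cost_usd` and its parameters are hypothetical, and the default rates are simply the Standard prices quoted above ($5.00 input, $15.00 output per 1M tokens).

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_rate: float = 5.00,    # USD per 1M input tokens (Standard rate listed above)
    output_rate: float = 15.00,  # USD per 1M output tokens (Standard rate listed above)
) -> float:
    """Estimate the cost of one request in USD from per-1M-token rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Example: 6,000 input tokens and 1,000 output tokens.
cost = estimate_cost_usd(6_000, 1_000)
print(f"${cost:.4f}")  # prints $0.0450
```

Passing different `input_rate` and `output_rate` values gives the same estimate for any other row in the Versions table that lists prices.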
Versions
| Version | Released | Context | Input / 1M | Output / 1M | Status |
|---|---|---|---|---|---|
| Grok 4.20 Multi-Agent | — | 2.0M | $2.00 | $6.00 | Available |
| Grok 4.20 Multi Agent Beta | — | 2.0M | $2.00 | $6.00 | Available |
| Grok 4 20 Non-Reasoning | — | 2.0M | — | — | Available |
| Grok 4 20 Reasoning | — | 2.0M | — | — | Available |
| Grok 4 20 | — | 131K | $3.00 | $15.00 | Available |
| Grok 4.1 Fast | — | 2.0M | $0.200 | $0.500 | Available |
| Grok 4.1 Fast Non-Reasoning | — | 2.0M | $0.200 | $0.500 | Available |
| Grok 4.1 Fast Reasoning | — | 2.0M | $0.200 | $0.500 | Available |
| Grok 4 Non-Reasoning | — | 2.0M | $2.00 | $6.00 | Available |
| Grok 4 Reasoning | — | 2.0M | $2.00 | $6.00 | Available |
| Grok Vision Beta | — | 8K | $5.00 | $15.00 | Current |