Groq
The Groq LPU delivers inference with the speed and cost developers need.
Inference platform · OpenAI-compatible API · Fast Inference · Low Latency · LPU · Open Weight
Intelligence vs Price
Best value among Groq models on this chart: GPT OSS 120B · GPT OSS 20B · Llama 3.1 8B Instruct.
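The page does not say how "best value" is computed. One plausible reading is a Pareto front over price and Intelligence score: a model is kept if no other model is both cheaper and at least as capable. The sketch below applies that idea to the four models that have an Intelligence score in the table on this page. The names `Model`, `blended_price`, and `pareto_front` are hypothetical, and the equal weighting of input and output price is an assumption, not something the page states.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    input_price: float   # "Input Price, $" column
    output_price: float  # "Output Price, $" column
    intelligence: float  # "Intelligence" column


# The four models with an Intelligence score in the table on this page.
MODELS = [
    Model("GPT OSS 120B", 0.03, 0.15, 23.8),
    Model("GPT OSS 20B", 0.0145, 0.07, 14.9),
    Model("Llama 3.3 70B Instruct", 0.1, 0.2, 8.6),
    Model("Llama 3.1 8B Instruct", 0.02, 0.03, 7.6),
]


def blended_price(m: Model) -> float:
    """ASSUMPTION: weight input and output prices equally."""
    return (m.input_price + m.output_price) / 2


def pareto_front(models: list[Model]) -> list[Model]:
    """Keep models that no other model beats on both price and score."""
    front = []
    for m in models:
        dominated = any(
            blended_price(o) <= blended_price(m)
            and o.intelligence >= m.intelligence
            # At least one comparison must be strict, so a model
            # never dominates itself.
            and (
                blended_price(o) < blended_price(m)
                or o.intelligence > m.intelligence
            )
            for o in models
        )
        if not dominated:
            front.append(m)
    return front


if __name__ == "__main__":
    for m in pareto_front(MODELS):
        print(f"{m.name}: score {m.intelligence}, blended ${blended_price(m):.4f}")
```

Under these assumptions the front contains the same three models the chart highlights, while Llama 3.3 70B Instruct drops out because GPT OSS 120B is both cheaper and higher scoring. A different price weighting could change the result.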
Groq models
14 models, 11 with pricing

| Model | Input Price, $ | Output Price, $ | Context | Max Output | Inference Providers | Intelligence | Coding |
|---|---|---|---|---|---|---|---|
| GPT OSS 120B | 0.03 | 0.15 | 131K | 131K | 23 | 23.8 (#1) | 30.4 (#1) |
| GPT OSS 20B | 0.0145 | 0.07 | 131K | 131K | 18 | 14.9 (#2) | 20.7 (#2) |
| Llama 3.3 70B Instruct | 0.1 | 0.2 | 131K | 120K | 21 | 8.6 (#3) | 11.9 (#3) |
| Llama 3.1 8B Instruct | 0.02 | 0.03 | 200K | 128K | 21 | 7.6 (#4) | 5.4 (#4) |
| Gemma 7B IT | 0.05 | 0.08 | 8K | 8K | 3 | N/A | N/A |
| GPT OSS 20B Safeguard | 0.07 | 0.2 | 131K | 66K | 5 | N/A | N/A |
| Kimi K2 Instruct | 0.5 | 2.00 | 262K | 33K | 9 | N/A | N/A |
| Llama 4 17B Maverick Instruct | 0.05 | 0.1 | 1.0M | 16K | 9 | N/A | N/A |
| Llama 4 17B Scout Instruct | 0.05 | 0.1 | 10.0M | 16K | 12 | N/A | N/A |
| LlamaGuard 4 12B | 0.18 | 0.18 | 164K | 16K | 4 | N/A | N/A |
| PlayAI TTS | N/A | N/A | 10K | 10K | 1 | N/A | N/A |
| Qwen3 32B | 0.05 | 0.1 | 131K | 41K | 15 | N/A | N/A |
| Whisper 3 Large | N/A | N/A | N/A | N/A | 1 | N/A | N/A |
| Whisper 3 Large Turbo | N/A | N/A | N/A | N/A | 3 | N/A | N/A |
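
To turn the listed prices into a rough per-request cost, a small calculation like the following may help. It is a minimal sketch: the page does not state the pricing unit, so the code assumes dollars per 1 million tokens, which is a common convention but unconfirmed here. `PRICES`, `TOKENS_PER_PRICE_UNIT`, and `estimate_cost` are hypothetical names, and only a few rows from the table are copied in.

```python
# Prices copied from the table above: (input price $, output price $).
PRICES = {
    "GPT OSS 120B": (0.03, 0.15),
    "GPT OSS 20B": (0.0145, 0.07),
    "Llama 3.3 70B Instruct": (0.1, 0.2),
    "Llama 3.1 8B Instruct": (0.02, 0.03),
}

# ASSUMPTION: the page does not give the unit. Prices are treated
# here as dollars per 1,000,000 tokens.
TOKENS_PER_PRICE_UNIT = 1_000_000


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one request under the assumed unit."""
    input_price, output_price = PRICES[model]
    return (
        input_tokens * input_price + output_tokens * output_price
    ) / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Example: a 4,000-token prompt with a 1,000-token completion.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 4_000, 1_000):.6f}")
```

If the real unit differs, only `TOKENS_PER_PRICE_UNIT` needs to change. Models listed with N/A pricing are left out rather than guessed.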