GPT-4o Audio
GPT-4o Audio is OpenAI's language model with a 128K-token context window and up to 16K output tokens, available from two providers. It is a multimodal GPT-4o variant that accepts audio input and produces audio output alongside text, enabling voice-capable conversational applications.
Capabilities
Input: 1/5
Output: 1/5
Capabilities: 2/13
Pricing by Provider
Standard tier:

| Provider | Input $ / 1M | Output $ / 1M | Audio In $ / 1M | Audio Out $ / 1M |
|---|---|---|---|---|
| Azure AI Foundry | $2.50 | $10.00 | $2.50 | $80.00 |
| OpenAI | $2.50 | $10.00 | $40.00 | $80.00 |
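
To show how the per-million-token prices above translate into a request cost, here is a minimal sketch. It assumes billing is strictly linear in token count for each of the four categories, with no caching discounts or minimum charges. The function name `estimate_cost` and the sample token counts are hypothetical and used only for illustration. The prices are copied from the table.

```python
# Standard-tier prices in USD per 1M tokens, copied from the pricing table.
PRICES = {
    "Azure AI Foundry": {"input": 2.50, "output": 10.00, "audio_in": 2.50, "audio_out": 80.00},
    "OpenAI": {"input": 2.50, "output": 10.00, "audio_in": 40.00, "audio_out": 80.00},
}


def estimate_cost(provider, input_tokens=0, output_tokens=0,
                  audio_in_tokens=0, audio_out_tokens=0):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes each category is billed linearly: tokens * price / 1,000,000.
    """
    price = PRICES[provider]
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "audio_in": audio_in_tokens,
        "audio_out": audio_out_tokens,
    }
    return sum(price[kind] * count for kind, count in usage.items()) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: the token counts below are illustrative only.
    workload = dict(input_tokens=1_000, output_tokens=500,
                    audio_in_tokens=10_000, audio_out_tokens=5_000)
    for provider in PRICES:
        print(f"{provider}: ${estimate_cost(provider, **workload):.4f}")
```

With these listed prices, the only difference between the two providers is the audio input rate, so the gap between them grows with the amount of audio sent to the model.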
Versions
| Version | Released | Context | Input / 1M | Output / 1M | Status |
|---|---|---|---|---|---|
| GPT-5.4 Mini | — | 1.1M | $0.750 | $4.50 | Available |
| GPT-5.4 Nano | — | 1.1M | $0.200 | $1.25 | Available |
| GPT-5.4 | — | 1.1M | $2.50 | $15.00 | Available |
| GPT-5.4 Pro | — | 1.1M | $30.00 | $180.00 | Available |
| GPT-5.4 3.5 | — | 1.1M | — | — | Available |
| GPT-5.4 Pro 3.5 | — | 1.1M | — | — | Available |
| GPT-5.3 Chat | — | 128K | $1.75 | $14.00 | Available |
| GPT-5.3 Codex | — | 400K | $1.75 | $14.00 | Available |
| GPT-5.3 Codex Spark | — | 128K | — | — | Available |
| GPT-5.3 Instant | — | 128K | — | — | Available |
| GPT-4o Audio | — | 128K | — | — | Current |