GPT-4o-audio Preview
GPT-4o-audio Preview is a text model from
OpenAI with a context window of 128K tokens and max output of 16K tokens. Pricing starts at 2.50 per million input tokens and 10.00 per million output tokens.
Capabilities
✗ Vision✓ Function Calling✗ Reasoning✗ JSON Schema✓ System Messages✗ Web Search✗ Prompt Caching✓ Audio Input✓ Audio Output
Specifications
| Model Key | gpt-4o-audio-preview-2025-06-03 |
| Provider | |
| Provider ID | openai |
| Mode | Text |
| Canonical Name | gpt-4o-audio |
| Context Window | 128K tokens |
| Max Output | 16K tokens |
Pricing
| Type | Per 1K Tokens | Per 1M Tokens |
|---|---|---|
| Input Tokens | 0.0025 | 2.50 |
| Output Tokens | 0.010 | 10.00 |
Benchmarks
No benchmark data is available for this model.
All Variants
All available versions, regions, and API endpoints for GPT-4o-audio Preview.
Model Key | Provider | Mode | Input Price, $ | Output Price, $ | Context | Max Output | Vision | Functions |
|---|---|---|---|---|---|---|---|---|
| gpt-4o-audio-preview | Text | 2.50 | 10.00 | 128K | 16K | no | yes | |
| gpt-4o-audio-preview-2024-10-01 | Text | N/A | N/A | 128K | 16K | no | yes | |
| gpt-4o-audio-preview-2024-12-17 | Text | 2.50 | 10.00 | 128K | 16K | no | yes | |
| gpt-4o-audio-preview-2025-06-03 | Text | 2.50 | 10.00 | 128K | 16K | no | yes |