GPT-4o Transcribe is OpenAI's speech-to-text model with a 16K-token context window and up to 2K output tokens. It is available from 2 providers, with pricing starting at $2.50 per 1M input tokens and $10.00 per 1M output tokens. Powered by GPT-4o, this automatic speech recognition (ASR) model delivers a lower word error rate and better language recognition than the earlier Whisper models.
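| Field | Value |
|---|---|
| Model ID | openai-gpt-4o-transcribe |
| Type | Speech to Text |
| Status | Active |
| Context window | 16K tokens |
| Max output | 2K tokens |
| Input modalities | Audio, Text |
| Output modalities | Text |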
Capabilities
| Category | Supported | Not supported |
|---|---|---|
| Input (2/5) | Text, Audio | Image, Video, PDF |
| Output (1/5) | Text | Image, Audio, Video, Embedding |
| Capabilities (0/13) | None | Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill |
Pricing by Provider
| Provider | Standard Input $ / 1M | Standard Output $ / 1M | Audio In $ / 1M | Audio Out $ / 1M | Batch Input $ / 1M | Batch Output $ / 1M | Flex Input $ / 1M | Flex Output $ / 1M | Priority Input $ / 1M | Priority Output $ / 1M |
|---|---|---|---|---|---|---|---|---|---|---|
| Azure AI Foundry | $2.50 | $10.00 | $2.50 | N/A | — | — | — | — | — | — |
| OpenAI | $2.50 | $10.00 | $2.50 | $10.00 | $2.50 | $10.00 | $2.50 | $10.00 | $2.50 | $10.00 |
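The per-million-token rates above translate into a per-request cost with simple arithmetic. The following is a minimal sketch, assuming the listed standard rates ($2.50 input, $10.00 output per 1M tokens) and that the token counts for a request are already known. The function name `estimate_cost` and the sample token counts are hypothetical and are not part of any provider's API.

```python
from decimal import Decimal

# Standard rates listed for GPT-4o Transcribe, in USD per 1M tokens.
# These are copied from the pricing table and may change over time.
INPUT_USD_PER_M = Decimal("2.50")
OUTPUT_USD_PER_M = Decimal("10.00")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the listed standard rates.

    Hypothetical helper for illustration only.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_M
        + Decimal(output_tokens) * OUTPUT_USD_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Hypothetical request: 12,000 input tokens and 1,500 output tokens.
    print(f"Estimated cost: ${estimate_cost(12_000, 1_500)}")
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when many requests are summed.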
Versions
| Version | Released | Context | Input / 1M | Output / 1M | Status |
|---|---|---|---|---|---|
| GPT-5.5 | | 1.1M | $5.00 | $30.00 | Available |
| GPT-5.4 Mini | | 1.1M | $0.750 | $4.50 | Available |
| GPT-5.4 Nano | | 1.1M | $0.200 | $1.25 | Available |
| GPT-5.4 | | 1.1M | $2.50 | $15.00 | Available |
| GPT-5.3 Codex | | 400K | $1.75 | $14.00 | Available |
| GPT-5.2 Codex | | 400K | $1.75 | $14.00 | Available |
| GPT-5.2 | | 410K | $1.75 | $14.00 | Available |
| GPT-5.1 | | 410K | $1.25 | $10.00 | Available |
| GPT-5.1 Codex | | 400K | $1.25 | $10.00 | Available |
| GPT-5.1 Codex Mini | | 400K | $0.250 | $2.00 | Available |
| GPT-4o Transcribe | — | 16K | $2.50 | $10.00 | Current |