GPT-4o Transcribe is OpenAI's speech-to-text model, with a 16K-token context window and up to 2K output tokens. It is available from 2 providers, with pricing starting at $2.50 per 1M input tokens and $10.00 per 1M output tokens. It is an automatic speech recognition (ASR) model powered by GPT-4o that delivers improved word error rate and language recognition compared with the earlier Whisper models.
Specifications
Canonical ID: openai-gpt-4o-transcribe
Type: Speech to Text
Status: Active
Creator: OpenAI
Providers: 2 (Azure AI Foundry, OpenAI)
Context Window: 16K tokens
Max Output: 2K tokens
Input Modalities: Audio, Text
Output Modalities: Text
Knowledge Cutoff: not listed
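
The modalities listed above (audio and text in, text out) map onto a single transcription request. The following is a minimal sketch, not a definitive integration: it assumes the official `openai` Python package, whose client exposes `audio.transcriptions.create`, and the helper name `transcribe_file` and its parameters are hypothetical. The client is passed in so the function itself needs no network access or API key until it is called.

```python
from typing import Any, Optional


def transcribe_file(
    client: Any,
    audio_path: str,
    prompt: Optional[str] = None,
    model: str = "gpt-4o-transcribe",
) -> str:
    """Send one audio file for transcription and return the transcript text.

    `client` is assumed to be an `openai.OpenAI()` instance, or any object
    exposing the same `audio.transcriptions.create` method.
    """
    kwargs = {"model": model}
    if prompt is not None:
        # Optional text hint (the "Text" input modality), e.g. expected names.
        kwargs["prompt"] = prompt
    # The audio is uploaded as a binary file object.
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(file=audio_file, **kwargs)
    # With the default response format, the transcript is on `.text`.
    return result.text
```

With the `openai` package installed and `OPENAI_API_KEY` set, a call might look like `transcribe_file(OpenAI(), "meeting.mp3")` after `from openai import OpenAI`; the file name is a placeholder.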

Capabilities

Input (2/5 supported): Text, Audio
Input not supported: Image, Video, PDF
Output (1/5 supported): Text
Output not supported: Image, Audio, Video, Embedding
Capabilities (0/13 supported): none
Capabilities not supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

Prices are in $ per 1M tokens.

| Provider | Model ID | Standard Input | Standard Output | Standard Audio In | Standard Audio Out | Batch Input | Batch Output | Flex Input | Flex Output | Priority Input | Priority Output |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Azure AI Foundry | azure/gpt-4o-transcribe | $2.50 | $10.00 | $2.50 | N/A | | | | | | |
| OpenAI | gpt-4o-transcribe | $2.50 | $10.00 | $2.50 | $10.00 | $2.50 | $10.00 | $2.50 | $10.00 | $2.50 | $10.00 |
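
The per-million rates above translate into a per-request cost with simple arithmetic. For example, 10,000 input tokens cost 10,000 × $2.50 / 1,000,000 = $0.025 and 1,000 output tokens cost 1,000 × $10.00 / 1,000,000 = $0.010, for a total of $0.035. The sketch below encodes that calculation; the function name and the token counts are hypothetical, and it assumes every input token is billed at the listed standard Input rate (it ignores the separate Audio In and Audio Out columns).

```python
from decimal import Decimal

# Listed standard rates in USD per 1M tokens (from the pricing table).
INPUT_RATE_PER_M = Decimal("2.50")
OUTPUT_RATE_PER_M = Decimal("10.00")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_RATE_PER_M
        + Decimal(output_tokens) * OUTPUT_RATE_PER_M
    )
    return total / TOKENS_PER_M


# Hypothetical request: 10,000 input tokens and 1,000 output tokens.
print(estimate_cost(10_000, 1_000))  # 0.035
```

`Decimal` is used instead of `float` so that small currency amounts are not distorted by binary rounding.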


Versions

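| Version | Released | Context | Input / 1M | Output / 1M | Status |
| --- | --- | --- | --- | --- | --- |
| GPT-5.5 | | 1.1M | $5.00 | $30.00 | Available |
| GPT-5.4 Mini | | 1.1M | $0.750 | $4.50 | Available |
| GPT-5.4 Nano | | 1.1M | $0.200 | $1.25 | Available |
| GPT-5.4 | | 1.1M | $2.50 | $15.00 | Available |
| GPT-5.3 Codex | | 400K | $1.75 | $14.00 | Available |
| GPT-5.2 Codex | | 400K | $1.75 | $14.00 | Available |
| GPT-5.2 | | 410K | $1.75 | $14.00 | Available |
| GPT-5.1 | | 410K | $1.25 | $10.00 | Available |
| GPT-5.1 Codex | | 400K | $1.25 | $10.00 | Available |
| GPT-5.1 Codex Mini | | 400K | $0.250 | $2.00 | Available |
| GPT-4o Transcribe | | 16K | $2.50 | $10.00 | Current |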

Model IDs