GPT-4o Transcribed Audio is OpenAI's language model. An audio-modality endpoint of GPT-4o for retrieving transcribed audio content in streaming or batch workflows.
Specifications
Canonical IDopenai-gpt-4o-transcribed-audio
TypeLanguage
StatusActive
CreatorOpenAIOpenAI
Providers
Input ModalitiesText
Output ModalitiesText

Capabilities

Input1/5
Text
Image·
Audio·
Video·
PDF·
Output1/5
Text
Image·
Audio·
Video·
Embedding·
Capabilities0/13
Reasoning·
Adaptive Reasoning·
Function Calling·
Parallel Function Calling·
Structured Outputs·
Native JSON Schema·
Web Search·
URL Context·
Computer Use·
Code Execution·
File Search·
Prompt Caching·
Assistant Prefill·

Pricing by Provider

US Dollar ($)
Per 1M tokens
ProviderStandard
Audio In
$ / 1M
Azure AI Foundry logo
Azure AI Foundry
openai:gpt4otcrbdaud
$6.00

Cost Calculator

US Dollar ($)
Preset:

Versions

VersionReleasedContextInput / 1MOutput / 1MStatus
GPT-5.51.1M$5.00$30.00Available
GPT-5.4 Mini1.1M$0.750$4.50Available
GPT-5.4 Nano1.1M$0.200$1.25Available
GPT-5.41.1M$2.50$15.00Available
GPT-5.3 Codex400K$1.75$14.00Available
GPT-5.2 Codex400K$1.75$14.00Available
GPT-5.2410K$1.75$14.00Available
GPT-5.1410K$1.25$10.00Available
GPT-5.1 Codex400K$1.25$10.00Available
GPT-5.1 Codex Mini400K$0.250$2.00Available
GPT-4o Transcribed AudioCurrent

Model IDs

openai-gpt-4o-transcribed-audio