GPT-4o Realtime Cached Audio is an OpenAI language model: a cached-audio endpoint of GPT-4o Realtime that uses implicit caching to reduce latency and cost for repeated audio interactions.
Specifications
Canonical ID: openai-gpt-4o-realtime-cached-audio
Type: Language
Status: Active
Creator: OpenAI
Providers
Input Modalities: Text
Output Modalities: Text

Capabilities

Input (1/5)
Text
Image·
Audio·
Video·
PDF·
Output (1/5)
Text
Image·
Audio·
Video·
Embedding·
Capabilities (0/13)
Reasoning·
Adaptive Reasoning·
Function Calling·
Parallel Function Calling·
Structured Outputs·
Native JSON Schema·
Web Search·
URL Context·
Computer Use·
Code Execution·
File Search·
Prompt Caching·
Assistant Prefill·

Pricing by Provider

Provider | Model ID | Standard: Audio In ($ / 1M)
Azure AI Foundry | openai:gpt4orealtimecachedaudio | $20.00
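
The listed rate can be turned into a per-request estimate with simple arithmetic. The sketch below assumes the "$ / 1M" unit means US dollars per one million audio input tokens; the function name `audio_input_cost` is hypothetical and chosen only for illustration.

```python
# Rate copied from the pricing listing: $20.00 per 1M (assumed: audio input tokens).
AUDIO_IN_USD_PER_MILLION = 20.00


def audio_input_cost(tokens: int, rate_per_million: float = AUDIO_IN_USD_PER_MILLION) -> float:
    """Return the estimated USD cost of `tokens` audio input tokens."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return tokens / 1_000_000 * rate_per_million


if __name__ == "__main__":
    # 250,000 tokens at $20.00 per 1M tokens.
    print(f"${audio_input_cost(250_000):.2f}")  # prints "$5.00"
```

The estimate covers audio input only, since that is the only rate listed for this provider.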

Versions

Version | Released | Context | Input / 1M | Output / 1M | Status
GPT-5.5 | | 1.1M | $5.00 | $30.00 | Available
GPT-5.4 Mini | | 1.1M | $0.750 | $4.50 | Available
GPT-5.4 Nano | | 1.1M | $0.200 | $1.25 | Available
GPT-5.4 | | 1.1M | $2.50 | $15.00 | Available
GPT-5.3 Codex | | 400K | $1.75 | $14.00 | Available
GPT-5.2 Codex | | 400K | $1.75 | $14.00 | Available
GPT-5.2 | | 410K | $1.75 | $14.00 | Available
GPT-5.1 | | 410K | $1.25 | $10.00 | Available
GPT-5.1 Codex | | 400K | $1.25 | $10.00 | Available
GPT-5.1 Codex Mini | | 400K | $0.250 | $2.00 | Available
GPT-4o Realtime Cached Audio | | | | | Current
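
To compare the listed versions for a given workload, the input and output rates can be combined into one figure. This is a minimal sketch assuming the "/ 1M" columns are US dollars per one million tokens; only a few rows are copied in, and the names `RATES` and `request_cost` are hypothetical.

```python
# (input, output) rates in USD per 1M tokens (assumed unit), copied from the Versions listing.
RATES = {
    "GPT-5.5": (5.00, 30.00),
    "GPT-5.4": (2.50, 15.00),
    "GPT-5.4 Mini": (0.75, 4.50),
    "GPT-5.4 Nano": (0.20, 1.25),
}


def request_cost(version: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one workload on the given version."""
    in_rate, out_rate = RATES[version]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # Same workload on every listed version: 2M input tokens, 500K output tokens.
    for name in RATES:
        print(f"{name}: ${request_cost(name, 2_000_000, 500_000):.2f}")
```

Rows without listed rates, such as GPT-4o Realtime Cached Audio, are left out of `RATES` rather than guessed.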

Model IDs