Name: Whisper 3
Brand: OpenAI

Whisper 3 is OpenAI's speech-to-text model, with a 4K-token context window and up to 4K output tokens. It is the third-generation Whisper automatic speech recognition (ASR) model, with improved multilingual recognition accuracy across diverse audio conditions.

Specifications
Canonical ID	`openai-whisper-3`
Type	Speech to Text
Status	Active
Creator	OpenAI
Context Window	4K tokens
Max Output	4K tokens
Input Modalities	Audio
Output Modalities	Text

Capabilities
(✓ = supported, · = not supported)

Input (1/5)
Text	·
Image	·
Audio	✓
Video	·
PDF	·

Output (1/5)
Text	✓
Image	·
Audio	·
Video	·
Embedding	·

Capabilities (0/13)
Reasoning	·
Adaptive Reasoning	·
Function Calling	·
Parallel Function Calling	·
Structured Outputs	·
Native JSON Schema	·
Web Search	·
URL Context	·
Computer Use	·
Code Execution	·
File Search	·
Prompt Caching	·
Assistant Prefill	·

Versions

Version	Released	Context	Input / 1M	Output / 1M	Status
Whisper 3	—	4K	—	—	Current
Whisper 3 Large	—	—	—	—	Available
Whisper 3 Large Turbo	—	—	—	—	Available
Whisper 3 Turbo	—	4K	—	—	Available
Whisper 2 Large	—	—	—	—	Available
Whisper	2022-09-21	—	$0.000	—	Available
Whisper Base	—	—	—	—	Available
Whisper Large	—	—	—	—	Available
Whisper Medium	—	—	—	—	Available
Whisper Small	—	—	—	—	Available
Whisper Tiny	—	—	—	—	Available

Model IDs

fireworks_ai/accounts/fireworks/models/whisper-v3

openai-whisper-3

Whisper 3

Capabilities API	GET /api/v1/models/openai-whisper-3
Versions API	GET /api/v1/models?family=whisper
Model IDs API	GET /api/v1/models/openai-whisper-3
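
The GET paths listed above can be queried from a script. The following is a minimal sketch, not the catalog's official client: the host in `BASE_URL` is a placeholder (the page gives only the paths), the helper names `model_url`, `family_url`, and `fetch_json` are hypothetical, and the response is assumed to be JSON.

```python
import json
import urllib.parse
import urllib.request

# Placeholder host (assumption): the page lists only the paths, not the server.
BASE_URL = "https://example.com"


def model_url(model_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for one model's record (capabilities and model IDs)."""
    quoted = urllib.parse.quote(model_id, safe="")
    return f"{base_url.rstrip('/')}/api/v1/models/{quoted}"


def family_url(family: str, base_url: str = BASE_URL) -> str:
    """Build the URL that lists every version in a model family."""
    query = urllib.parse.urlencode({"family": family})
    return f"{base_url.rstrip('/')}/api/v1/models?{query}"


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the body, assuming the server returns JSON."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a real host substituted for the placeholder, `fetch_json(model_url("openai-whisper-3"))` would request the model record and `fetch_json(family_url("whisper"))` would request the version list. The shape of the returned data is not documented on this page, so inspect it before relying on specific fields.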
