Voxtral Small 24B is a 24B-parameter audio-language model from Mistral AI, built on Mistral Small 3, that excels at speech transcription, translation, and audio understanding. It has a 128K-token context window and produces up to 8K output tokens. It is available from 2 providers, with prices starting at $0.100 per 1M input tokens and $0.300 per 1M output tokens.
Specifications
Canonical ID: mistral-voxtral-small
Type: Language
Status: Active
Creator: Mistral AI
Providers: 2 (Amazon Bedrock, OpenRouter)
Context Window: 128K tokens
Max Output: 8K tokens
Input Modalities: Audio, Text
Output Modalities: Text
Release Date: 7 months ago

Capabilities

Input (2/5)
Supported: Text, Audio
Not supported: Image, Video, PDF

Output (1/5)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (3/13)
Supported: Function Calling, Structured Outputs, Native JSON Schema
Not supported: Reasoning, Adaptive Reasoning, Parallel Function Calling, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

All prices are in $ per 1M tokens.

Amazon Bedrock (mistral.voxtral-small-24b-2507)
Standard: Input $0.100, Output $0.300, Cache Read N/A, Audio In N/A
Batch: Input $0.050, Output $0.150
Flex: Input $0.050, Output $0.150
Priority: Input $0.170, Output $0.520

OpenRouter (mistralai/voxtral-small-24b-2507)
Standard: Input $0.100, Output $0.300, Cache Read $0.010, Audio In $100.00
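
The per-token prices above translate into a request cost by simple proportion. The following is a minimal sketch, not a provider API: the function name `estimate_cost`, the `BEDROCK_PRICES` table, and the token counts are illustrative assumptions, and it assumes billing is strictly linear in tokens at the Amazon Bedrock rates listed above (it ignores cache-read and audio-input pricing).

```python
from decimal import Decimal

# $ per 1M tokens as (input, output), copied from the Amazon Bedrock
# pricing listed above. The dictionary itself is a hypothetical helper.
BEDROCK_PRICES = {
    "standard": (Decimal("0.100"), Decimal("0.300")),
    "batch": (Decimal("0.050"), Decimal("0.150")),
    "flex": (Decimal("0.050"), Decimal("0.150")),
    "priority": (Decimal("0.170"), Decimal("0.520")),
}

MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, tier: str = "standard") -> Decimal:
    """Estimate the dollar cost of one request, assuming linear per-token billing."""
    if tier not in BEDROCK_PRICES:
        raise ValueError(f"unknown tier: {tier!r}")
    input_price, output_price = BEDROCK_PRICES[tier]
    # Scale each token count by its price per million tokens.
    return (input_tokens * input_price + output_tokens * output_price) / MILLION


if __name__ == "__main__":
    # Hypothetical request: 100,000 input tokens and 8,000 output tokens.
    for tier in BEDROCK_PRICES:
        print(tier, estimate_cost(100_000, 8_000, tier))
```

Under these assumptions, a request with 100,000 input tokens and 8,000 output tokens costs $0.0124 on the Standard tier and half that, $0.0062, on Batch or Flex. `Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.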

Other models

Voxtral Mini 3B: Tier Mini, Context 128K, Input $0.040 / 1M, Output $0.040 / 1M
Voxtral TTS
Voxtral Mini Transcribe Realtime: Tier Mini, $0.006 / 1M
Voxtral Mini TTS: Tier Mini, Input $0.000 / 1M, Output $16.00 / 1M

Model IDs