CosyVoice 3 Flash is Alibaba's fast-tier CosyVoice 3 text-to-speech (TTS) model, designed for efficient, low-latency voice generation. Pricing starts at $0.13 per 1M input tokens.
Specifications
Canonical ID: alibaba-cosyvoice-3-flash
Type: Language
Status: Active
Creator: Alibaba
Providers: Alibaba Qwen
Input Modalities: Text
Output Modalities: Text

Capabilities

Input (1/5 supported)
Supported: Text
Not supported: Image, Audio, Video, PDF

Output (1/5 supported)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (0/13 supported)
Not supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

Prices in US dollars ($) per 1M tokens.

Provider       Model ID             Standard input   Batch input
Alibaba Qwen   cosyvoice-v3-flash   $0.13            $0.065
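
As a rough illustration of how the per-1M rates above translate into a bill, the following sketch multiplies a token count by the standard or batch rate. The function name `estimate_cost` and the example token count are hypothetical, and the sketch assumes billing is strictly linear in input tokens with no minimums or rounding rules, which the listing does not confirm.

```python
from decimal import Decimal

# Rates copied from the pricing table, in US dollars per 1M input tokens.
STANDARD_PER_MILLION = Decimal("0.13")
BATCH_PER_MILLION = Decimal("0.065")


def estimate_cost(input_tokens: int, batch: bool = False) -> Decimal:
    """Estimate the input cost in USD (hypothetical helper).

    Assumes cost scales linearly with input tokens; real invoices may
    apply rounding or minimum charges that are not listed here.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    rate = BATCH_PER_MILLION if batch else STANDARD_PER_MILLION
    return rate * Decimal(input_tokens) / Decimal(1_000_000)


if __name__ == "__main__":
    tokens = 2_500_000  # example workload, chosen only for illustration
    print("standard:", estimate_cost(tokens))
    print("batch:   ", estimate_cost(tokens, batch=True))
```

`Decimal` is used instead of `float` so that small per-token amounts add up without binary rounding drift.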

Versions

Version               Released   Context   Input / 1M   Output / 1M   Status
CosyVoice 3 Flash     -          -         $0.130       -             Current
CosyVoice 3.5 Flash   -          -         $0.116       -             Available

Other Models

Model                Tier   Released   Context   Input / 1M   Output / 1M
CosyVoice 3 Plus     Plus   -          -         $0.260       -
CosyVoice 3.5 Plus   Plus   -          -         $0.220       -
CosyVoice 2          -      -          -         $0.287       -
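
To compare the CosyVoice models listed in the Versions and Other Models sections, a small sketch like the one below can rank them by estimated cost for a given workload. It assumes each listed price is the input rate per 1M tokens (the listing shows only one price per model) and that cost is linear in tokens. The names `PRICES_PER_MILLION` and `rank_by_cost` are hypothetical.

```python
from decimal import Decimal

# Listed prices in US dollars, assumed to be per 1M input tokens.
PRICES_PER_MILLION = {
    "CosyVoice 3 Flash": Decimal("0.130"),
    "CosyVoice 3.5 Flash": Decimal("0.116"),
    "CosyVoice 3 Plus": Decimal("0.260"),
    "CosyVoice 3.5 Plus": Decimal("0.220"),
    "CosyVoice 2": Decimal("0.287"),
}


def rank_by_cost(input_tokens: int) -> list[tuple[str, Decimal]]:
    """Return (model, estimated USD cost) pairs, cheapest first."""
    costs = {
        name: rate * input_tokens / 1_000_000
        for name, rate in PRICES_PER_MILLION.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # 10M tokens is an arbitrary example workload.
    for name, cost in rank_by_cost(10_000_000):
        print(f"{name:<20} ${cost}")
```

Price is only one axis of comparison; the listing gives no quality or latency figures, so the ranking says nothing about which model suits a given use case.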

Model IDs

alibaba-cosyvoice-3-flash
cosyvoice-v3-flash