CosyVoice 3.5 Flash is a fast-tier variant of Alibaba's CosyVoice 3.5 TTS model, optimized for low-latency speech synthesis. Pricing starts at $0.116 per 1M input tokens.
Specifications
Canonical ID: alibaba-cosyvoice-3-5-flash
Type: Language
Status: Active
Creator: Alibaba
Providers
Input Modalities: Text
Output Modalities: Text

Capabilities

Input (1/5)
Supported: Text
Not supported: Image, Audio, Video, PDF

Output (1/5)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (0/13)
Not supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

Prices are in US dollars ($) per 1M tokens.

Provider | Model ID | Standard input | Batch input
Alibaba Qwen | cosyvoice-v3.5-flash | $0.116 | $0.058
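As a rough illustration of how the Standard and Batch input prices translate into spend, the sketch below estimates cost from a token count. It assumes billing is strictly linear in input tokens with no minimum charge or rounding, which the page does not confirm; `estimate_cost` and the constant names are hypothetical helpers, not part of any provider SDK.

```python
# Listed input prices for cosyvoice-v3.5-flash on Alibaba Qwen (USD per 1M tokens).
STANDARD_PER_MILLION = 0.116
BATCH_PER_MILLION = 0.058


def estimate_cost(input_tokens: int, batch: bool = False) -> float:
    """Estimate USD cost for `input_tokens`, assuming purely linear billing."""
    rate = BATCH_PER_MILLION if batch else STANDARD_PER_MILLION
    return input_tokens / 1_000_000 * rate


# Example: 2.5M input tokens at each listed rate.
standard = estimate_cost(2_500_000)
batched = estimate_cost(2_500_000, batch=True)
print(f"standard: ${standard:.3f}, batch: ${batched:.3f}")
```

At the listed rates, the batch price is half the standard price, so the same workload costs half as much when submitted in batch mode.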

Versions

Version | Released | Context | Input / 1M | Output / 1M | Status
CosyVoice 3.5 Flash | - | - | $0.116 | - | Current
CosyVoice 3 Flash | - | - | $0.130 | - | Available

Other Models

Model | Tier | Released | Context | Input / 1M | Output / 1M
CosyVoice 3 Plus | Plus | - | - | $0.260 | -
CosyVoice 3.5 Plus | Plus | - | - | $0.220 | -
CosyVoice 2 | - | - | - | $0.287 | -
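To put the listed prices side by side, the following sketch ranks the five CosyVoice models from this page against CosyVoice 3.5 Flash. It assumes every listed figure is the input price per 1M tokens, since the page gives a single price per model; `PRICES` and `relative_price` are illustrative names only.

```python
# Listed prices in USD per 1M tokens (assumed to be the input column),
# copied from the Versions and Other Models listings on this page.
PRICES = {
    "CosyVoice 3.5 Flash": 0.116,
    "CosyVoice 3 Flash": 0.130,
    "CosyVoice 3.5 Plus": 0.220,
    "CosyVoice 3 Plus": 0.260,
    "CosyVoice 2": 0.287,
}

BASELINE = "CosyVoice 3.5 Flash"


def relative_price(model: str, baseline: str = BASELINE) -> float:
    """Return the price of `model` as a multiple of the baseline price."""
    return PRICES[model] / PRICES[baseline]


# Print models from cheapest to most expensive with their multiple of the baseline.
for name, price in sorted(PRICES.items(), key=lambda kv: kv[1]):
    print(f"{name:<20} ${price:.3f}  {relative_price(name):.2f}x")
```

Under that assumption, CosyVoice 3.5 Flash is the cheapest of the five listed models, and CosyVoice 2 is the most expensive at roughly 2.47 times the baseline.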

Model IDs

alibaba-cosyvoice-3-5-flash
cosyvoice-v3.5-flash