Aya Vision


Aya Vision is Cohere's language model with a 16K context window and up to 4K output tokens. It is a compact 8B multimodal model that supports vision and text across multiple languages and is optimized for low-latency multilingual image-and-text tasks.
Spec
Canonical ID: cohere-aya-vision-8b
Type: Language
Status: Active
Creator: Cohere
Context Window: 16K tokens
Max Output: 4K tokens
Input Modalities: Image, Text
Output Modalities: Text
Parameters: 8B
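
The context window and max output figures above imply a simple token budget. The following is a minimal sketch of that arithmetic, not an official formula. It assumes that "16K" and "4K" mean 16 * 1024 and 4 * 1024 tokens, and that output tokens count against the context window; neither is stated on this page. The function name `max_prompt_tokens` is hypothetical.

```python
# Assumption: "16K" / "4K" are read as multiples of 1024 tokens.
CONTEXT_WINDOW = 16 * 1024  # 16K-token context window (from the spec)
MAX_OUTPUT = 4 * 1024       # 4K-token output cap (from the spec)


def max_prompt_tokens(requested_output: int) -> int:
    """Return the prompt budget left after reserving output tokens.

    Assumption: prompt and output share the same context window.
    """
    if not 0 < requested_output <= MAX_OUTPUT:
        raise ValueError(
            f"requested_output must be between 1 and {MAX_OUTPUT} tokens"
        )
    return CONTEXT_WINDOW - requested_output


if __name__ == "__main__":
    # Reserve the full output cap and see what remains for the prompt.
    print(max_prompt_tokens(MAX_OUTPUT))
```

Under these assumptions, reserving the full 4K output leaves roughly 12K tokens for the prompt, including any tokens consumed by image inputs.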

Capabilities

Input (2/5)
Supported: Text, Image
Not supported: Audio, Video, PDF

Output (1/5)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (0/13)
Supported: none
Not supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Versions

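Version     | Released | Context | Input / 1M tokens | Output / 1M tokens | Status
Aya Vision  | -        | 16K     | -                 | -                  | Current
Aya 101     | -        | -       | -                 | -                  | Available
Aya 101     | -        | -       | -                 | -                  | Available
Aya Expanse | -        | 128K    | $0.500            | $1.50              | Available
Aya Expanse | -        | 8K      | $0.500            | $1.50              | Available
Aya Vision  | -        | 16K     | -                 | -                  | Available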

Model IDs