Voyage Multimodal 3 is Voyage AI's multimodal embedding model with a 32K-token context window. It supports joint text and image retrieval for cross-modal AI applications. It is available from 2 providers, starting at $0.120 per 1M input tokens and $0.120 per 1M output tokens.
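To illustrate what joint text and image retrieval means in practice, here is a minimal sketch of ranking items in a shared embedding space by cosine similarity. The vectors are hand-written 3-dimensional stand-ins, not real model output, and the names `cosine_similarity`, `rank_by_similarity`, and the corpus keys are hypothetical. In a real pipeline the vectors would come from the embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, corpus):
    """Return corpus keys ordered from most to least similar to the query."""
    return sorted(
        corpus,
        key=lambda name: cosine_similarity(query_vec, corpus[name]),
        reverse=True,
    )


# Toy stand-in embeddings (assumed values, for illustration only).
# Text and image items share one vector space, so a single text
# query can be compared against both kinds of item.
corpus = {
    "image: photo_of_cat.jpg": [0.9, 0.1, 0.0],
    "image: invoice_scan.png": [0.0, 0.2, 0.9],
    "text: a cat on a sofa": [0.8, 0.3, 0.1],
}
query_vec = [1.0, 0.2, 0.0]  # stand-in embedding of a text query about cats

ranking = rank_by_similarity(query_vec, corpus)
for name in ranking:
    print(name)
```

Because every item lives in the same space, no separate text index and image index are needed. The ranking step is identical whatever the modality of the query or the document.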
Specifications
Canonical ID: voyage-multimodal-3
Type: Embedding
Status: Active
Creator: Voyage
Providers: Snowflake, Voyage
Context Window: 32K tokens
Input Modalities: Text
Output Modalities: Embedding

Capabilities

Input (1/5): Text is listed as supported; Image, Audio, Video, and PDF are not.
Output (1/5): Embedding is listed as supported; Text, Image, Audio, and Video are not.
Capabilities (0/13): none are listed as supported (Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill).

Pricing by Provider

Prices are in $ / 1M tokens. A dash marks a cell with no listed value.

Provider   Model ID                    Standard Input  Standard Output  Batch Input  Batch Output
Snowflake  voyage-multimodal-3         $0.120          $0.120           $0.060       $0.060
Voyage     voyage/voyage-multimodal-3  $0.120          N/A              -            -
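The per-million-token prices above translate into a job cost with simple arithmetic: cost = tokens / 1,000,000 x price. The sketch below hard-codes the listed input prices and assumes billing is purely per input token. The function name `estimate_input_cost` and the lowercase provider and tier keys are hypothetical, and any image-specific billing rules are outside what this page lists.

```python
from decimal import Decimal

# Input prices in USD per 1M tokens, copied from the pricing listed above.
# Only combinations with a listed price are included.
PRICE_PER_MILLION = {
    ("snowflake", "standard"): Decimal("0.120"),
    ("snowflake", "batch"): Decimal("0.060"),
    ("voyage", "standard"): Decimal("0.120"),
}


def estimate_input_cost(tokens, provider, tier="standard"):
    """Estimate the USD cost of embedding `tokens` input tokens."""
    key = (provider, tier)
    if key not in PRICE_PER_MILLION:
        raise KeyError(f"no listed price for {provider!r} at tier {tier!r}")
    # Decimal avoids binary floating-point rounding in currency math.
    return PRICE_PER_MILLION[key] * tokens / Decimal(1_000_000)


# Example: embedding 5 million tokens on each listed Snowflake tier.
standard_cost = estimate_input_cost(5_000_000, "snowflake")
batch_cost = estimate_input_cost(5_000_000, "snowflake", tier="batch")
print(f"standard: ${standard_cost}, batch: ${batch_cost}")
```

At the listed prices the batch tier costs half as much as the standard tier for the same token count, so it is the cheaper choice when results are not needed immediately.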

Versions

A dash marks a cell with no listed value.

Version              Released  Context  Input / 1M  Output / 1M  Status
Text Embedding 5     -         2K       $0.025      -            Available
Embed 4              -         128K     $0.120      $0.470       Available
Embed 4 Img          -         -        $0.470      -            Available
Embed 4 Txt          -         -        $0.120      -            Available
Text Embedding 4     -         2K       $0.100      -            Deprecated
Voyage 4             -         32K      $0.060      -            Available
Voyage 4 Large       -         32K      $0.120      -            Available
Voyage 4 Lite        -         32K      $0.020      -            Available
Voyage 3.5           -         32K      $0.060      -            Available
Voyage 3.5 Lite      -         32K      $0.020      -            Available
Voyage Multimodal 3  -         32K      $0.120      $0.120       Current

Model IDs