Gemini Multimodal Embeddings


Gemini Multimodal Embeddings is Google's embedding model, available from 2 providers, starting at $0.200 / 1M input. Output pricing is not applicable (N/A), since the model returns embeddings rather than generated output. It is a Gemini embedding model that encodes text, images, and other modalities into a shared vector space.
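
A shared vector space means that embeddings of different inputs can be compared directly, typically with cosine similarity. The sketch below illustrates that idea only. The vectors are hand-written, hypothetical 3-dimensional values, not output from this model, and the names `cosine_similarity` and `rank_by_similarity` are illustrative helpers, not part of any Google API.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Sort (name, score) pairs from most to least similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings: made-up numbers standing in for real model output.
text_query = [0.9, 0.1, 0.0]
candidates = {
    "image_of_cat": [0.8, 0.2, 0.1],
    "image_of_car": [0.1, 0.9, 0.3],
}

ranking = rank_by_similarity(text_query, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because all vectors live in the same space, a text query can be ranked against image embeddings with the same function, with no per-modality conversion step.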
Spec
Canonical ID: google-gemini-multimodal-embeddings
Type: Embedding
Status: Active
Creator: Google
Providers: Google Gemini, Google Vertex AI
Input Modalities: Text
Output Modalities: Embedding

Capabilities

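Input (1/5)
Text: supported
Image: not supported
Audio: not supported
Video: not supported
PDF: not supported

Output (1/5)
Text: not supported
Image: not supported
Audio: not supported
Video: not supported
Embedding: supported

Capabilities (0/13)
Reasoning: not supported
Adaptive Reasoning: not supported
Function Calling: not supported
Parallel Function Calling: not supported
Structured Outputs: not supported
Native JSON Schema: not supported
Web Search: not supported
URL Context: not supported
Computer Use: not supported
Code Execution: not supported
File Search: not supported
Prompt Caching: not supported
Assistant Prefill: not supported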

Pricing by Provider

Standard tier

Provider            Model ID                       Input ($ / 1M)   Audio In ($ / 1M)
Google Gemini       gemini-multimodal-embeddings   $0.200           $6.50
Google Vertex AI    gemini-multimodal-embeddings   $0.200           $6.50
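
As a rough illustration of how the listed rates translate into a bill, the sketch below multiplies usage by the per-million prices from the table. It assumes the "1M" unit refers to tokens and that input and audio input are billed independently. Neither assumption is stated on this page, and `estimate_cost_usd` is a hypothetical helper, not a provider API.

```python
from decimal import Decimal

# Standard-tier rates copied from the pricing table (USD per 1M units).
# Assumption: the unit is tokens; the page itself only says "$ / 1M".
PRICE_PER_MILLION_USD = {
    "input": Decimal("0.200"),
    "audio_in": Decimal("6.50"),
}

ONE_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_units=0, audio_in_units=0):
    """Estimate cost in USD for the given usage at the listed rates."""
    if input_units < 0 or audio_in_units < 0:
        raise ValueError("usage counts must be non-negative")
    input_cost = Decimal(input_units) * PRICE_PER_MILLION_USD["input"]
    audio_cost = Decimal(audio_in_units) * PRICE_PER_MILLION_USD["audio_in"]
    return (input_cost + audio_cost) / ONE_MILLION


# Example: 5M input units plus 200k audio-input units.
total = estimate_cost_usd(input_units=5_000_000, audio_in_units=200_000)
print(f"Estimated cost: ${total:.2f}")
```

`Decimal` is used instead of floats so that small per-unit prices do not accumulate binary rounding error. Both listed providers show the same rates, so the estimate is the same for either one.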

Versions

Version                             Released   Context   Input / 1M   Output / 1M   Status
Gemini Multimodal Embeddings        -          -         $0.200       $0.000        Current
Gemini Live Multimodal Embeddings   -          -         $0.200       $0.000        Available
Gemma 300M Embedding                -          -         -            -             Available
TextEmbedding Gecko                 -          -         -            -             Available

Model IDs