Llama 3.2 90B Vision Instruct is Meta's 90B-parameter, instruction-tuned vision-language model from the Llama 3.2 family, optimized for high-capacity visual recognition, reasoning, and captioning tasks. It has a 128K-token context window, produces up to 16K output tokens, and is available from 4 providers, with pricing starting at $0.90 per 1M input tokens and $0.90 per 1M output tokens.
Specifications
Canonical ID: meta-llama-3-2-90b-vision-instruct
Type: Language
Status: Active
Creator: Meta
Providers: 4
Context Window: 128K tokens
Max Output: 16K tokens
Input Modalities: Image
Output Modalities: Text
Parameters: 90B
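
The context window and max output limits above jointly bound how many tokens a single request can generate. The following is a minimal sketch of that arithmetic, not any provider's documented behavior. It assumes "128K" and "16K" mean 128,000 and 16,000 tokens and that prompt and output tokens share the context window; the function name `max_completion_tokens` is hypothetical.

```python
# Assumption: "128K" and "16K" are read as round thousands; a provider
# may instead use 131,072 and 16,384, so adjust these constants as needed.
CONTEXT_WINDOW = 128_000
MAX_OUTPUT = 16_000


def max_completion_tokens(prompt_tokens: int, requested: int) -> int:
    """Clamp a requested output length to the limits listed above.

    Assumes prompt and output tokens are counted against the same
    context window (hypothetical; check the provider's documentation).
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt alone fills or exceeds the context window")
    # The tightest of the three bounds wins.
    return min(requested, MAX_OUTPUT, remaining)
```

For example, a 120,000-token prompt leaves room for at most 8,000 output tokens under these assumptions, even though the model's output cap is higher.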

Capabilities

Input (1 of 5 supported)
Supported: Image
Not supported: Text, Audio, Video, PDF

Output (1 of 5 supported)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (3 of 13 supported)
Supported: Function Calling, Parallel Function Calling, Structured Outputs
Not supported: Reasoning, Adaptive Reasoning, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

Standard pricing in US Dollar ($), per 1M tokens.

Provider | Model ID | Input ($ / 1M) | Output ($ / 1M)
Azure AI Foundry | azure_ai/Llama-3.2-90B-Vision-Instruct | $2.04 | $2.04
Fireworks AI | fireworks_ai/accounts/fireworks/models/llama-v3p2-90b-vision-instruct | $0.90 | $0.90
IBM watsonx | watsonx/meta-llama/llama-3-2-90b-vision-instruct | $2.00 | $2.00
Oracle Cloud (OCI) | oci/meta.llama-3.2-90b-vision-instruct | $2.00 | $2.00
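
The per-provider prices above translate into a request cost by scaling each rate by the token count divided by one million. A minimal sketch of that calculation follows, using the listed standard rates. The names `PRICES`, `estimate_cost`, and `cheapest_provider` are hypothetical, and how each provider converts image inputs into billable tokens is not covered here.

```python
from decimal import Decimal

# (input, output) price in USD per 1M tokens, copied from the listing above.
PRICES = {
    "Azure AI Foundry": (Decimal("2.04"), Decimal("2.04")),
    "Fireworks AI": (Decimal("0.90"), Decimal("0.90")),
    "IBM watsonx": (Decimal("2.00"), Decimal("2.00")),
    "Oracle Cloud (OCI)": (Decimal("2.00"), Decimal("2.00")),
}
MILLION = Decimal(1_000_000)


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at standard rates."""
    input_price, output_price = PRICES[provider]
    return (input_price * input_tokens + output_price * output_tokens) / MILLION


def cheapest_provider(input_tokens: int, output_tokens: int) -> str:
    """Return the provider with the lowest estimated cost for this workload."""
    return min(PRICES, key=lambda p: estimate_cost(p, input_tokens, output_tokens))
```

With 200,000 input tokens and 50,000 output tokens, `estimate_cost("Fireworks AI", 200_000, 50_000)` works out to $0.225, against $0.51 for the same workload on Azure AI Foundry. `Decimal` is used so the dollar amounts are not distorted by binary floating-point rounding.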

Versions

Version | Released | Context | Input / 1M | Output / 1M | Status
Llama 3.2 11B Vision Instruct | not listed | 131K | $0.015 | $0.025 | Available
Llama 3.2 90B Vision Instruct | not listed | 128K | $0.900 | $0.900 | Current

Model IDs

accounts/fireworks/models/llama-v3p2-90b-vision-instruct
azure_ai/Llama-3.2-90B-Vision-Instruct
fireworks_ai/accounts/fireworks/models/llama-v3p2-90b-vision-instruct
meta-llama-3-2-90b-vision-instruct
meta-vlm-llama-3-2-90b-vision-instruct
oci/meta.llama-3.2-90b-vision-instruct
vertex_ai/meta/llama-3.2-90b-vision-instruct-maas
watsonx/meta-llama/llama-3-2-90b-vision-instruct
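
Several of the IDs above begin with what looks like a provider routing prefix (`azure_ai/`, `fireworks_ai/`, `oci/`, `vertex_ai/`, `watsonx/`), while others are bare names or provider-native paths. The sketch below separates the two parts. It assumes that the segment before the first slash names the provider only when it matches one of the prefixes seen in this list; that convention is inferred from the IDs themselves, and `KNOWN_PREFIXES` and `split_model_id` are hypothetical names.

```python
from typing import Optional, Tuple

# Assumption: these prefixes are inferred from the Model IDs listed above,
# not taken from any provider or library documentation.
KNOWN_PREFIXES = {"azure_ai", "fireworks_ai", "oci", "vertex_ai", "watsonx"}


def split_model_id(model_id: str) -> Tuple[Optional[str], str]:
    """Split an ID into (provider_prefix, provider_model_name).

    Returns (None, model_id) when the ID carries no recognized prefix,
    e.g. the canonical ID or a provider-native path.
    """
    prefix, sep, rest = model_id.partition("/")
    if sep and prefix in KNOWN_PREFIXES:
        return prefix, rest
    return None, model_id
```

Under this assumption, `fireworks_ai/accounts/fireworks/models/llama-v3p2-90b-vision-instruct` splits into the `fireworks_ai` prefix and the Fireworks-native path, whereas `accounts/fireworks/models/llama-v3p2-90b-vision-instruct` is returned unchanged because `accounts` is not a recognized prefix.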