Llama 3.2 90B Vision


Llama 3.2 90B Vision is Meta's 90B-parameter pre-trained vision-language model in the Llama 3.2 series. It takes images and text as input and generates text.
Spec
Canonical ID: meta-llama-3-2-90b-vision
Type: Language
Status: Active
Creator: Meta
Input Modalities: Image, Text
Output Modalities: Text
Parameters: 90B

Capabilities

Input: 2/5
Supported: Text, Image
Not supported: Audio, Video, PDF

Output: 1/5
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities: 0/13
Not supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Versions

Version                  Released  Context  Input / 1M  Output / 1M  Status
Llama 3.2 11B            -         128K     $0.160      $0.160       Available
Llama 3.2 11B Instruct   -         128K     $0.350      $0.350       Deprecated
Llama 3.2 1B Instruct    -         128K     $0.027      $0.080       Deprecated
Llama 3.2 3B Instruct    -         131K     $0.015      $0.020       Deprecated
Llama 3.2 90B            -         128K     $0.720      $0.720       Available
Llama 3.2 90B Instruct   -         128K     $2.00       $2.00        Deprecated
Llama 3.2 1B             -         131K     $0.100      $0.100       Available
Llama 3.2 3B             -         131K     $0.040      $0.080       Available
Llama 3.1 405B Instruct  -         131K     $0.120      $0.300       Deprecating
Llama 3.1 70B            -         128K     $0.600      $0.600       Available
Llama 3.2 90B Vision     -         -        -           -            Current
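
The per-version prices listed above can be turned into a rough per-request cost estimate. The sketch below is illustrative only: it assumes the "Input / 1M" and "Output / 1M" columns are USD per one million tokens, and the names `PRICES_PER_MILLION` and `estimate_cost` are hypothetical helpers, not part of any Meta or provider API. Only a few rows are copied in, and the Llama 3.2 90B Vision row is left out because it lists no prices.

```python
# Prices copied from the Versions listing.
# Assumption: values are USD per 1,000,000 tokens, as (input, output).
PRICES_PER_MILLION = {
    "Llama 3.2 90B": (0.720, 0.720),
    "Llama 3.2 11B": (0.160, 0.160),
    "Llama 3.2 1B Instruct": (0.027, 0.080),
}


def estimate_cost(version: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of one request in USD (hypothetical helper)."""
    input_price, output_price = PRICES_PER_MILLION[version]
    # Each price applies per million tokens, so scale the token counts down.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    cost = estimate_cost("Llama 3.2 90B", input_tokens=10_000, output_tokens=2_000)
    print(f"Estimated cost: ${cost:.5f}")
```

Under these assumptions, a request to Llama 3.2 90B with 10,000 input tokens and 2,000 output tokens comes to a little under one cent.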

Model IDs