Llama 3.2 90B Vision Instruct


Llama 3.2 90B Vision Instruct is Meta's vision-language model with a 128K context window and up to 8K output tokens, starting at $0.720 / 1M input tokens and $0.720 / 1M output tokens. It is a 90B-parameter, instruction-tuned Llama 3.2 model optimized for high-accuracy visual recognition, image reasoning, and captioning at large scale.
Spec
Canonical ID: meta-llama-3-2-90b
Type: Language
Status: Active
Creator: Meta
Providers: Vercel AI Gateway
Context Window: 128K tokens
Max Output: 8K tokens
Input Modalities: Image
Output Modalities: Text
Parameters: 90B
Release Date: 2 years ago

Capabilities

Input (1 of 5 supported)
Supported: Image
Not supported: Text, Audio, Video, PDF

Output (1 of 5 supported)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (1 of 13 supported)
Supported: Function Calling
Not supported: Reasoning, Adaptive Reasoning, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

| Provider | Model ID | Standard Input ($ / 1M) | Standard Output ($ / 1M) |
| --- | --- | --- | --- |
| Vercel AI Gateway | meta/llama-3.2-90b | $0.720 | $0.720 |
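
As a rough illustration of how the per-million-token prices above translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost` and the example token counts are hypothetical, and the sketch assumes billing is linear in token count. How image inputs are converted to tokens is not stated on this page, so the input token count is taken as given.

```python
# Standard-tier prices from the pricing table (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.72
OUTPUT_PRICE_PER_M = 0.72


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, assuming linear per-token billing."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 10,000 input tokens and 1,000 output tokens.
print(f"${estimate_cost(10_000, 1_000):.6f}")  # prints $0.007920
```

Because input and output are priced identically for this model, the estimate depends only on the total number of tokens in the request.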

Versions

| Version | Released | Context | Input / 1M | Output / 1M | Status |
| --- | --- | --- | --- | --- | --- |
| Llama 3.3 70B Instruct | | 131K | $0.720 | $0.720 | Available |
| Llama 3.3 70B Instruct | | 131K | $0.100 | $0.300 | Available |
| Llama 3.3 | | | | | Available |
| Llama 3.3 70B Instruct Turbo | | 131K | $0.130 | $0.390 | Available |
| Llama 3.3 70B Versatile | | 128K | $0.590 | $0.790 | Available |
| Llama 3.3 8B Instruct | | 128K | | | Available |
| Llama 3.2 90B Vision Instruct | | 128K | $0.720 | $0.720 | Current |
| Llama 3.2 11B Vision Instruct | | 128K | $0.160 | $0.160 | Available |
| Llama 3.2 1B Instruct | | 128K | $0.027 | $0.080 | Deprecated |
| Llama 3.2 3B Instruct | | 131K | $0.015 | $0.020 | Deprecated |
| Llama 3.2 1B | | 131K | $0.100 | $0.100 | Available |

Model IDs