Qwen2 VL 72B Instruct


Qwen2 VL 72B Instruct is Alibaba's language model with a 131K-token context window and up to 2K output tokens. It is available from 3 providers, with prices starting at $0.130 per 1M input tokens and $0.400 per 1M output tokens. It is a 72-billion-parameter vision-language model from the Qwen2-VL series, delivering high-capacity multimodal understanding across images and text.
Spec
Canonical ID: alibaba-qwen2-vl-72b-instruct
Type: Language
Status: Active
Creator: Alibaba
Providers: Nebius, Fireworks AI, Alibaba Qwen
Context Window: 131K tokens
Max Output: 2K tokens
Input Modalities: Image
Output Modalities: Text
Parameters: 72B

Capabilities

Input (1/5)
Supported: Image
Not supported: Text, Audio, Video, PDF

Output (1/5)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (1/13)
Supported: Function Calling
Not supported: Reasoning, Adaptive Reasoning, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

All prices are in USD per 1M tokens. A "-" means no price is listed for that tier.

| Provider | Model ID | Standard Input | Standard Output | Batch Input | Batch Output |
| Nebius | Qwen/Qwen2-VL-72B-Instruct | $0.130 | $0.400 | - | - |
| Fireworks AI | accounts/fireworks/models/qwen2-vl-72b-instruct | $0.900 | $0.900 | - | - |
| Alibaba Qwen | qwen2-vl-72b-instruct | $2.29 | $6.88 | $1.15 | $3.44 |
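
The per-provider comparison can be reproduced offline with a few lines of Python. The following is a minimal sketch, not the page's own calculator: the prices are copied from the pricing table above, while the names `Price`, `PRICES`, `cost_usd`, and `compare` are hypothetical and chosen only for illustration. It assumes cost scales linearly with token counts and ignores any provider-specific minimums, discounts, or image-token accounting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD price per 1M tokens (hypothetical helper type)."""
    input_per_m: float
    output_per_m: float


# Prices copied from the pricing table; keys are (provider, tier).
PRICES = {
    ("Nebius", "standard"): Price(0.130, 0.400),
    ("Fireworks AI", "standard"): Price(0.900, 0.900),
    ("Alibaba Qwen", "standard"): Price(2.29, 6.88),
    ("Alibaba Qwen", "batch"): Price(1.15, 3.44),
}


def cost_usd(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Linear cost estimate in USD for one workload."""
    total = input_tokens * price.input_per_m + output_tokens * price.output_per_m
    return total / 1_000_000


def compare(input_tokens: int, output_tokens: int) -> list[tuple[str, str, float]]:
    """Return (provider, tier, cost) rows sorted from cheapest to most expensive."""
    rows = [
        (provider, tier, cost_usd(price, input_tokens, output_tokens))
        for (provider, tier), price in PRICES.items()
    ]
    return sorted(rows, key=lambda row: row[2])


if __name__ == "__main__":
    # Example workload: 10,000 input tokens and 500 output tokens.
    for provider, tier, cost in compare(10_000, 500):
        print(f"{provider:<14} {tier:<9} ${cost:.6f}")
```

Keying the table by `(provider, tier)` keeps the batch tier as a separate row, so a provider that lists no batch price simply has no entry rather than a placeholder value.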

Versions

| Version | Released | Context | Input / 1M | Output / 1M | Status |
| Qwen2 VL 72B Instruct | - | 131K | $0.130 | $0.400 | Current |
| Qwen2 VL 2B Instruct | - | 33K | $0.100 | $0.100 | Available |
| Qwen2 VL 7B Instruct | - | 131K | $0.020 | $0.060 | Available |

Model IDs