Qwen3 VL 235B A22B Instruct is Alibaba's vision-language model with a 262K-token context window and up to 129K output tokens. It is available from 7 providers, with prices starting at $0.200 per 1M input tokens and $0.880 per 1M output tokens. It is the flagship instruction-tuned vision-language MoE model in the Qwen3 series, with 235B total and 22B activated parameters, built for visual perception and reasoning.
Specifications
Canonical ID: alibaba-qwen3-vl-235b-a22b-instruct
Type: Language
Status: Active
Creator: Alibaba
Providers:
Context Window: 262K tokens
Max Output: 129K tokens
Input Modalities: Image, PDF, Text
Output Modalities: Text
Parameters: 235B
HuggingFace Likes: 383
HuggingFace Downloads (30d): 947,793
HuggingFace Downloads (all-time): 2,172,030
Release Date: 8 months ago
Knowledge Cutoff:
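As a rough illustration of how the two token limits above interact, the following sketch checks whether a request fits. The exact token counts behind "262K" and "129K" are not stated on this page, so the constants are approximations, and the rule that prompt and output share one context window is an assumption; `fits_budget` is a hypothetical helper, not part of any provider API.

```python
# Approximate limits read off the specification list above.
# Assumption: the page rounds these values; the true limits may differ slightly.
CONTEXT_WINDOW = 262_000  # "262K tokens"
MAX_OUTPUT = 129_000      # "129K tokens"


def fits_budget(prompt_tokens: int, requested_output: int) -> bool:
    """Return True if the request stays within both listed limits.

    Assumption: prompt tokens and generated tokens count against
    the same context window.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output > MAX_OUTPUT:
        return False
    return prompt_tokens + requested_output <= CONTEXT_WINDOW
```

For example, a 200,000-token prompt leaves room for roughly 62,000 output tokens under these assumptions, well below the listed output cap.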
Benchmarks
Intelligence Index: 20.8 (#215)
Coding Index: 16.5 (#209)
Math Index: 70.7 (#87)
MMLU-Pro: 0.8 (#64)
GPQA: 0.7 (#178)
HLE: 0.1 (#209)
LiveCodeBench: 0.6 (#107)
IFBench: 0.4 (#197)
Time to First Token: 1.09s (#343)
SciCode: 0.4 (#160)
AIME 2025: 0.7 (#87)
LCR: 0.3 (#199)
TerminalBench Hard: 0.1 (#218)
TAU2: 0.4 (#200)
Output TPS: 47.9 (#229)

Capabilities

Input (3/5): Text, Image, PDF. Not supported: Audio, Video.
Output (1/5): Text. Not supported: Image, Audio, Video, Embedding.
Capabilities (4/13): Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema. Not supported: Reasoning, Adaptive Reasoning, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill.
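The capability list above can be encoded as a small lookup for pre-flight checks before routing a request to this model. This is a minimal sketch: it assumes that entries marked with "·" are unsupported (which matches the 3/5, 1/5, and 4/13 counts), and the snake_case feature names and the `missing_features` helper are hypothetical, chosen only for illustration.

```python
# Supported features as listed above; "·" entries are treated as unsupported.
# Assumption: the snake_case keys are illustrative, not official identifiers.
INPUT_MODALITIES = frozenset({"text", "image", "pdf"})
OUTPUT_MODALITIES = frozenset({"text"})
CAPABILITIES = frozenset({
    "function_calling",
    "parallel_function_calling",
    "structured_outputs",
    "native_json_schema",
})


def missing_features(required: list[str]) -> list[str]:
    """Return the required capabilities this model does not list, sorted."""
    return sorted(set(required) - CAPABILITIES)


def accepts_inputs(modalities: list[str]) -> bool:
    """Return True if every requested input modality is listed as supported."""
    return set(modalities) <= INPUT_MODALITIES
```

A caller might reject a job early if `missing_features(["function_calling", "web_search"])` is non-empty, rather than discovering the gap at request time.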

Pricing by Provider

Versions

| Version | Released | Context | Input / 1M | Output / 1M | Status |
|---|---|---|---|---|---|
| Qwen3 VL 30B A3B Instruct | | 262K | $0.130 | $0.520 | Available |
| Qwen3 VL 30B A3B Thinking | | 262K | $0.130 | $0.600 | Available |
| Qwen3 VL 235B A22B Instruct | | 262K | $0.200 | $0.880 | Current |
| Qwen3 VL 235B A22B Thinking | | 262K | $0.220 | $0.880 | Available |
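To compare the versions above on cost, the per-million-token prices can be applied to an expected workload. The sketch below uses only the prices listed in the table. The `estimate_cost` helper is hypothetical, and actual provider billing (per-image charges, rounding, minimum fees) may differ.

```python
# (input USD per 1M tokens, output USD per 1M tokens), copied from the table above.
PRICES = {
    "Qwen3 VL 30B A3B Instruct": (0.130, 0.520),
    "Qwen3 VL 30B A3B Thinking": (0.130, 0.600),
    "Qwen3 VL 235B A22B Instruct": (0.200, 0.880),
    "Qwen3 VL 235B A22B Thinking": (0.220, 0.880),
}


def estimate_cost(version: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload at the listed starting prices."""
    input_price, output_price = PRICES[version]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Example workload: 5M input tokens and 1M output tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 5_000_000, 1_000_000):.2f}")
```

At these prices, the 235B Instruct version costs about 1.5 to 1.7 times as much as the 30B Instruct version, depending on the ratio of input to output tokens.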

Model IDs