Qwen3.7 Plus VL Instruct is a vision-language model from Alibaba. It is a Qwen3 mixture-of-experts model designed for multimodal instruction following, combining image understanding with strong language capabilities.
Capabilities
Input 1/5
Text✓
Image·
Audio·
Video·
PDF·
Output 1/5
Text✓
Image·
Audio·
Video·
Embedding·
Capabilities 0/13
Reasoning·
Adaptive Reasoning·
Function Calling·
Parallel Function Calling·
Structured Outputs·
Native JSON Schema·
Web Search·
URL Context·
Computer Use·
Code Execution·
File Search·
Prompt Caching·
Assistant Prefill·
Other Models
| Model | Tier | Released | Context | Input / 1M | Output / 1M |
|---|---|---|---|---|---|
| Qwen3.5 122B A10B | — | | 262K | $0.260 | $2.08 |
| Qwen3.5 35B A3B | — | | 262K | $0.140 | $1.00 |
| Qwen3.5 397B A17B | — | | 262K | $0.390 | $2.34 |
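
The per-million prices in the table can be turned into a rough per-request estimate. The sketch below is illustrative only: it assumes the "Input / 1M" and "Output / 1M" columns are USD per one million tokens and that billing scales linearly with token count. The names `PRICES` and `estimate_cost` are hypothetical helpers, not part of any provider API.

```python
# Prices copied from the table above, assumed to be USD per 1M tokens:
# (input price, output price).
PRICES = {
    "Qwen3.5 122B A10B": (0.260, 2.08),
    "Qwen3.5 35B A3B": (0.140, 1.00),
    "Qwen3.5 397B A17B": (0.390, 2.34),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return an estimated request cost in USD, assuming linear per-token billing."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Example: 200K input tokens and 50K output tokens on the 35B A3B model.
    cost = estimate_cost("Qwen3.5 35B A3B", 200_000, 50_000)
    print(f"{cost:.4f}")  # prints 0.0780
```

Because output tokens cost several times more than input tokens for every model listed, the output length tends to dominate the estimate for generation-heavy workloads.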