Qwen3.5 4B is a compact 4B-parameter language model from Alibaba's Qwen3.5 series. It balances small size with capable text generation, which makes it suited to efficient deployment scenarios.
Capabilities
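Input (1/5 supported)
| Modality | Supported |
|---|---|
| Text | ✓ |
| Image | · |
| Audio | · |
| Video | · |
| PDF | · |

Output (1/5 supported)
| Modality | Supported |
|---|---|
| Text | ✓ |
| Image | · |
| Audio | · |
| Video | · |
| Embedding | · |

Capabilities (0/13 supported)
| Capability | Supported |
|---|---|
| Reasoning | · |
| Adaptive Reasoning | · |
| Function Calling | · |
| Parallel Function Calling | · |
| Structured Outputs | · |
| Native JSON Schema | · |
| Web Search | · |
| URL Context | · |
| Computer Use | · |
| Code Execution | · |
| File Search | · |
| Prompt Caching | · |
| Assistant Prefill | · |
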
Cheapest Instances to Run It
Cloud GPU instances that can host Qwen3.5 4B, ranked by on-demand price with the cheapest first. The model needs about 10 GB of GPU memory at FP16 precision. This figure is estimated from the parameter count, so treat the fit as guidance rather than a guarantee.
| Instance | Cloud | GPU | VRAM | Price | Cheapest region |
|---|---|---|---|---|---|
| Standard_NV4as_v4 | Azure | AMD Radeon Instinct MI25 | 16 GB | $0.233/hr | westus2 |
| g5g.xlarge | AWS | T4g | 16 GB | $0.420/hr | us-east-1 |
| Standard_NV8as_v4 | Azure | AMD Radeon Instinct MI25 | 16 GB | $0.466/hr | westus2 |
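
The page does not say how the 10 GB estimate is derived. One plausible reading, sketched below, is 2 bytes per parameter at FP16 plus a fractional allowance for runtime overhead such as activations and the KV cache. The 25% overhead is an assumption chosen because it reproduces the 10 GB figure, and the names `Instance`, `estimate_vram_gb`, and `cheapest_fitting` are hypothetical helpers, not part of any cloud or model API. The instance rows are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    vram_gb: float
    usd_per_hour: float


def estimate_vram_gb(params_billion, bytes_per_param=2.0, overhead=0.25):
    """Weights (parameters x bytes each) plus an ASSUMED fractional overhead.

    1e9 parameters at 1 byte each is 1 GB, so billions x bytes gives GB directly.
    The 0.25 default is an illustrative guess, not a documented value.
    """
    weights_gb = params_billion * bytes_per_param
    return weights_gb * (1.0 + overhead)


def cheapest_fitting(instances, required_gb):
    """Keep instances with enough VRAM, sorted by hourly price, cheapest first."""
    fitting = [i for i in instances if i.vram_gb >= required_gb]
    return sorted(fitting, key=lambda i: i.usd_per_hour)


# Rows copied from the table above (VRAM in GB, on-demand price in USD/hr).
INSTANCES = [
    Instance("Standard_NV8as_v4", 16, 0.466),
    Instance("g5g.xlarge", 16, 0.420),
    Instance("Standard_NV4as_v4", 16, 0.233),
]

required = estimate_vram_gb(4)  # 4B parameters at FP16
for inst in cheapest_fitting(INSTANCES, required):
    print(f"{inst.name}: {inst.vram_gb} GB, ${inst.usd_per_hour:.3f}/hr")
```

Long contexts or large batches can push real memory use well above this kind of estimate, which is why the fit is best read as guidance.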
Versions
| Version | Released | Context | Input / 1M tokens | Output / 1M tokens | Status |
|---|---|---|---|---|---|
| EAGLE Qwen 2.5 3B Instruct | — | — | — | — | Available |
| Qwen3.7 Plus | — | 1.0M | $0.320 | $1.28 | Available |
| Qwen3.7 Max | — | 1.0M | $1.25 | $3.75 | Available |
| Qwen3.6 Max Preview | — | 262K | $1.04 | $6.24 | Available |
| Qwen3.6 27B | — | 262K | $0.150 | $0.500 | Available |
| Qwen3.6 35B A3B | — | 262K | $0.140 | $0.450 | Available |
| Qwen3.6 Plus | — | 1.0M | $0.325 | $1.95 | Available |
| Qwen3 Max Thinking | — | 262K | $0.780 | $3.90 | Available |
| Qwen3 Max | — | 262K | $0.780 | $3.90 | Available |
| Qwen3 Coder 30B A3B | — | 262K | $0.150 | $0.600 | Available |
| Qwen3.5 4B | — | — | — | — | Current |
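
To compare the hosted options in the table, the per-million prices can be turned into a per-request cost. The sketch below assumes the "/ 1M" columns are US dollars per one million tokens, billed separately for input and output. `PRICES_PER_MILLION` and `request_cost_usd` are hypothetical names for illustration, and only two rows from the table are copied in.

```python
# (input price, output price) in USD per 1M tokens, copied from the table above.
PRICES_PER_MILLION = {
    "Qwen3.6 27B": (0.150, 0.500),
    "Qwen3.7 Plus": (0.320, 1.28),
}


def request_cost_usd(model, input_tokens, output_tokens):
    """Cost of one request: tokens x price per token, summed over input and output."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example workload: 200,000 input tokens and 50,000 output tokens.
for model in PRICES_PER_MILLION:
    cost = request_cost_usd(model, 200_000, 50_000)
    print(f"{model}: ${cost:.4f}")
```

Qwen3.5 4B itself has no listed token prices, so for that model the hourly instance prices above are the relevant cost figure.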