Qwen3 4B is Alibaba's language model with a 131K context window and up to 20K output tokens. It is available from 4 providers, with prices starting at $0.03 per 1M input tokens and $0.03 per 1M output tokens. It is a compact 4B-parameter dense LLM from the Qwen3 series that supports hybrid thinking and non-thinking modes for efficient on-device or low-latency deployment.
Specifications
Canonical ID: alibaba-qwen3-4b
Type: Language
Status: Active
Creator: Alibaba
Providers: 4
Context Window: 131K tokens
Max Output: 20K tokens
Input Modalities: Text
Output Modalities: Text
Reasoning Efforts: default
Parameters: 4B
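
The context window and output limit above can be turned into a simple request check. The sketch below is illustrative only: it assumes the prompt and the generated output share the same context window, and it treats the rounded "131K" and "20K" figures as exact token counts, which a given provider may not do. The function name `max_output_budget` is hypothetical.

```python
# Nominal limits taken from the specification list.
# Assumption: the rounded figures are used as exact token counts.
CONTEXT_WINDOW = 131_000
MAX_OUTPUT = 20_000


def max_output_budget(prompt_tokens: int, requested_output: int) -> int:
    """Return the largest output-token limit that fits the request.

    Assumption: prompt tokens and output tokens share one context window.
    Raises ValueError when the prompt alone fills the window.
    """
    if prompt_tokens >= CONTEXT_WINDOW:
        raise ValueError("prompt does not fit in the context window")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The output is capped by the request, the model limit, and the space left.
    return min(requested_output, MAX_OUTPUT, remaining)


if __name__ == "__main__":
    print(max_output_budget(1_000, 50_000))
    print(max_output_budget(125_000, 20_000))
```

With a short prompt the model's output cap is the binding limit. With a prompt close to the window size, the remaining space becomes the limit instead.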

Capabilities

Input (1 of 5 supported)
Supported: Text
Not supported: Image, Audio, Video, PDF

Output (1 of 5 supported)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (2 of 13 supported)
Supported: Reasoning, Function Calling
Not supported: Adaptive Reasoning, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

Prices are in US dollars ($) per 1M tokens. A dash means no price is listed.

Provider      Model ID                                         Standard input  Standard output  Batch input  Batch output
Alibaba Qwen  qwen3-4b                                         $0.11           $0.42            $0.055       $0.21
Fireworks AI  fireworks_ai/accounts/fireworks/models/qwen3-4b  $0.2            $0.2             -            -
Nebius        nebius/Qwen/Qwen3-4B                             $0.08           $0.24            -            -
Novita        novita/qwen/qwen3-4b-fp8                         $0.03           $0.03            -            -
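
The per-provider prices above lend themselves to a quick cost comparison. The following is a minimal sketch that copies the standard (non-batch) prices from the pricing list and estimates the cost of a single request. The helper names `estimate_cost` and `cheapest_provider` are hypothetical, and the sketch ignores anything a provider might bill beyond plain input and output tokens.

```python
# Standard prices in USD per 1M tokens as (input, output),
# copied from the pricing list for Qwen3 4B.
PRICES = {
    "Alibaba Qwen": (0.11, 0.42),
    "Fireworks AI": (0.20, 0.20),
    "Nebius": (0.08, 0.24),
    "Novita": (0.03, 0.03),
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at standard pricing."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest_provider(input_tokens: int, output_tokens: int) -> str:
    """Return the provider with the lowest estimated cost for this token mix."""
    return min(PRICES, key=lambda p: estimate_cost(p, input_tokens, output_tokens))


if __name__ == "__main__":
    # Example workload: 100K input tokens and 5K output tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 100_000, 5_000):.6f}")
    print("Cheapest:", cheapest_provider(100_000, 5_000))
```

Because input and output prices differ between providers, the ranking can depend on the input-to-output ratio of the workload, so it is worth running the comparison with realistic token counts.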

Versions

Version                Released  Context  Input / 1M  Output / 1M  Status
Voyage 4 Nano          -         -        -           -            Available
Qwen3 Embedding 0.6B   -         33K      $0.010      -            Available
Qwen3 Embedding 4B     -         41K      $0.020      -            Available
Qwen3 Embedding 8B     -         41K      $0.020      -            Available
Qwen3 14B              -         132K     $0.060      $0.200       Available
Qwen3 32B              -         131K     $0.050      $0.100       Available
Qwen3 8B               -         131K     $0.035      $0.138       Available
Qwen3 4B               -         131K     $0.030      $0.030       Current
Qwen3 4B Instruct      -         262K     $0.010      $0.030       Available
KwaiPilot KAT 32B Dev  -         131K     $0.900      $0.900       Available
Qwen3 0.6B             -         41K      $0.100      $0.100       Available

Model IDs

accounts/fireworks/models/qwen3-4b
alibaba-qwen3-4b
fireworks_ai/accounts/fireworks/models/qwen3-4b
huggingface-reasoning-qwen3-4b
nebius/Qwen/Qwen3-4B
novita/qwen/qwen3-4b-fp8
qwen/qwen3-4b-fp8
qwen3-4b