Qwen3 4B

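Qwen3 4B is a text model offered through Nebius with a context window of 33K tokens and a maximum output of 33K tokens. On Nebius, pricing is $0.08 per million input tokens and $0.24 per million output tokens. The lowest listed price for a text variant is $0.03 per million tokens for both input and output, at Novita AI.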

Capabilities

Vision, Function Calling, Reasoning, JSON Schema, System Messages, Web Search, Prompt Caching, Audio Input, Audio Output

Specifications

Model Key: nebius/Qwen/Qwen3-4B
Provider: Nebius
Provider ID: nebius
Mode: Text
Canonical Name: qwen-3-4b
Context Window: 33K tokens
Max Output: 33K tokens

Pricing

Type            Per 1K Tokens, $   Per 1M Tokens, $
Input Tokens    0.000080           0.080
Output Tokens   0.000240           0.240
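
As a rough illustration of how the per-token prices above translate into a per-request cost, here is a minimal sketch. The prices are copied from the pricing table. The function name `estimate_cost` and the example token counts are hypothetical and do not come from any provider SDK.

```python
# Per-1M-token prices in dollars for nebius/Qwen/Qwen3-4B, from the pricing table.
INPUT_PRICE_PER_M = 0.080
OUTPUT_PRICE_PER_M = 0.240


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for a single request."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 10,000 prompt tokens and 2,000 completion tokens.
cost = estimate_cost(10_000, 2_000)
print(f"${cost:.6f}")  # prints $0.001280
```

The per-1K column is the same price divided by 1,000, so either column gives the same result as long as the divisor matches.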

Benchmarks

Benchmark             Score   Rank
Intelligence Index    12.5    #161
MMLU-Pro              0.6     #146
GPQA                  0.4     #174
HLE                   0.0     #192
LiveCodeBench         0.2     #137
AIME                  0.2     #63
Time to First Token   0.94s   #158
SciCode               0.2     #176
MATH-500              0.8     #60

Price Comparison by Provider

Compare prices for Qwen3 4B across different providers. The same model may be available through multiple providers at different price points.

Provider      Model Key                                                  Input Price, $  Output Price, $
Novita AI     novita/qwen/qwen3-4b-fp8                                   0.030           0.030
Nebius        nebius/Qwen/Qwen3-4B                                       0.080           0.240
Fireworks AI  fireworks_ai/accounts/fireworks/models/qwen3-embedding-4b  N/A             N/A
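
One way to act on this comparison is to pick the cheapest provider for a given workload while skipping entries with no listed price. The sketch below assumes the prices are dollars per 1M tokens, which matches the Nebius row in the pricing section. The `cheapest_provider` helper and the use of `None` for "N/A" are illustrative choices, not part of any provider API.

```python
from typing import List, Optional, Tuple

# (provider, model key, input price, output price); None stands for "N/A".
Row = Tuple[str, str, Optional[float], Optional[float]]

PRICES: List[Row] = [
    ("Novita AI", "novita/qwen/qwen3-4b-fp8", 0.030, 0.030),
    ("Nebius", "nebius/Qwen/Qwen3-4B", 0.080, 0.240),
    ("Fireworks AI",
     "fireworks_ai/accounts/fireworks/models/qwen3-embedding-4b", None, None),
]


def cheapest_provider(input_tokens: int, output_tokens: int) -> str:
    """Return the provider with the lowest total cost, ignoring unpriced rows."""
    priced = [r for r in PRICES if r[2] is not None and r[3] is not None]
    best = min(priced, key=lambda r: input_tokens * r[2] + output_tokens * r[3])
    return best[0]


print(cheapest_provider(10_000, 2_000))
```

Price is only one axis: the model keys differ between rows, so context limits and supported features should be checked for the chosen key as well.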

All Variants

All available versions, regions, and API endpoints for Qwen3 4B.

Model Key                                                  Provider      Mode       Input Price, $  Output Price, $  Context  Max Output  Vision  Functions
fireworks_ai/accounts/fireworks/models/qwen3-4b            Fireworks AI  Text       0.200           0.200            41K      41K         no      no
fireworks_ai/accounts/fireworks/models/qwen3-embedding-4b  Fireworks AI  Embedding  N/A             N/A              41K      41K         no      no
fireworks_ai/accounts/fireworks/models/qwen3-reranker-4b   Fireworks AI  Rerank     N/A             N/A              41K      41K         no      no
nebius/Qwen/Qwen3-4B                                       Nebius        Text       0.080           0.240            33K      33K         no      yes
novita/qwen/qwen3-4b-fp8                                   Novita AI     Text       0.030           0.030            128K     20K         no      no
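
To show how the variant list might be filtered by requirements, the following sketch encodes the rows above and selects model keys by mode, minimum context size, and function-calling support. The `Variant` dataclass and `select_variants` function are hypothetical names introduced for illustration. Context sizes are stored in thousands of tokens exactly as listed, so they are rounded figures rather than exact limits.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Variant:
    key: str
    provider: str
    mode: str
    context_k: int      # context window in thousands of tokens, as listed
    max_output_k: int   # max output in thousands of tokens, as listed
    functions: bool     # function calling support


VARIANTS: List[Variant] = [
    Variant("fireworks_ai/accounts/fireworks/models/qwen3-4b",
            "Fireworks AI", "Text", 41, 41, False),
    Variant("fireworks_ai/accounts/fireworks/models/qwen3-embedding-4b",
            "Fireworks AI", "Embedding", 41, 41, False),
    Variant("fireworks_ai/accounts/fireworks/models/qwen3-reranker-4b",
            "Fireworks AI", "Rerank", 41, 41, False),
    Variant("nebius/Qwen/Qwen3-4B", "Nebius", "Text", 33, 33, True),
    Variant("novita/qwen/qwen3-4b-fp8", "Novita AI", "Text", 128, 20, False),
]


def select_variants(mode: str, min_context_k: int = 0,
                    need_functions: bool = False) -> List[str]:
    """Return model keys that match the mode and meet the requirements."""
    return [
        v.key for v in VARIANTS
        if v.mode == mode
        and v.context_k >= min_context_k
        and (v.functions or not need_functions)
    ]


print(select_variants("Text", need_functions=True))
print(select_variants("Text", min_context_k=100))
```

With the rows as listed, requiring function calling leaves only the Nebius key, while requiring at least 100K of context leaves only the Novita AI key.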