Qwen3 4B Fp8 Pricing & Specs | AI Models

Qwen3 4B Fp8 is a text model served by Novita AI with a context window of 128K tokens and a maximum output of 20K tokens. Pricing starts at $0.03 per million input tokens and $0.03 per million output tokens (cheapest at Novita AI).

Capabilities

✗ Vision
✗ Function Calling
✓ Reasoning
✗ JSON Schema
✓ System Messages
✗ Web Search
✗ Prompt Caching
✗ Audio Input
✗ Audio Output

Specifications

Model Key	`novita/qwen/qwen3-4b-fp8`
Provider	Novita AI
Provider ID	novita
Mode	Text
Canonical Name	qwen-3-4b
Context Window	128K tokens
Max Output	20K tokens

Pricing

Type	Per 1K Tokens, $	Per 1M Tokens, $
Input Tokens	0.000030	0.030
Output Tokens	0.000030	0.030
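
The per-1M prices above translate into a per-request cost by simple proportion. The following is a minimal sketch, assuming the prices are in US dollars and that billing is linear in token count with no minimum charge; the function name `estimate_cost` and the example token counts are illustrative, not part of any provider API.

```python
from decimal import Decimal

# Prices copied from the pricing table (Novita AI listing), per 1M tokens.
# Assumption: the currency is USD and billing is strictly proportional.
INPUT_PRICE_PER_M = Decimal("0.030")
OUTPUT_PRICE_PER_M = Decimal("0.030")


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request from its token counts."""
    million = Decimal(1_000_000)
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / million


# Hypothetical request: 100,000 input tokens and 20,000 output tokens.
cost = estimate_cost(100_000, 20_000)
print(f"Estimated cost: ${cost}")
```

At these rates, a request that uses 100,000 input tokens and 20,000 output tokens comes to $0.0036. `Decimal` is used instead of `float` so that very small per-token amounts do not pick up rounding noise.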

Benchmarks

Metric	Value	Rank
Intelligence Index	12.5	#161
MMLU-Pro	0.6	#146
GPQA	0.4	#174
HLE	0.0	#194
LiveCodeBench	0.2	#137
AIME	0.2	#63
Time to First Token	0.95s	#162
SciCode	0.2	#176
MATH-500	0.8	#60

Price Comparison by Provider

Compare prices for Qwen3 4B Fp8 across different providers. The same model may be available through multiple providers at different price points.

Provider	Model Key	Input Price, $	Output Price, $
Novita AI	novita/qwen/qwen3-4b-fp8	0.030	0.030
Nebius	nebius/Qwen/Qwen3-4B	0.080	0.240
Fireworks AI	fireworks_ai/accounts/fireworks/models/qwen3-embedding-4b	N/A	N/A

All Variants

All available versions, regions, and API endpoints for Qwen3 4B Fp8.

Model Key	Provider	Mode	Input Price, $	Output Price, $	Context	Max Output	Vision	Functions
fireworks_ai/accounts/fireworks/models/qwen3-4b	Fireworks AI	Text	0.200	0.200	41K	41K	no	no
fireworks_ai/accounts/fireworks/models/qwen3-embedding-4b	Fireworks AI	Embedding	N/A	N/A	41K	41K	no	no
fireworks_ai/accounts/fireworks/models/qwen3-reranker-4b	Fireworks AI	Rerank	N/A	N/A	41K	41K	no	no
nebius/Qwen/Qwen3-4B	Nebius	Text	0.080	0.240	33K	33K	no	yes
novita/qwen/qwen3-4b-fp8	Novita AI	Text	0.030	0.030	128K	20K	no	no
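
Choosing among these variants usually means filtering out non-chat endpoints and rows without a listed price, then comparing cost for the expected token mix. The sketch below does that with the rows of the table above. It assumes prices are US dollars per 1M tokens and that the "K" context figures can be compared directly as thousands of tokens; `Variant` and `cheapest_text_variant` are hypothetical names used only for illustration.

```python
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    model_key: str
    mode: str
    input_price: Optional[float]   # per 1M tokens; None where the table says N/A
    output_price: Optional[float]  # per 1M tokens; None where the table says N/A
    context_k: int                 # context window, in thousands of tokens as listed


# Rows copied from the "All Variants" table.
VARIANTS = [
    Variant("fireworks_ai/accounts/fireworks/models/qwen3-4b", "Text", 0.200, 0.200, 41),
    Variant("fireworks_ai/accounts/fireworks/models/qwen3-embedding-4b", "Embedding", None, None, 41),
    Variant("fireworks_ai/accounts/fireworks/models/qwen3-reranker-4b", "Rerank", None, None, 41),
    Variant("nebius/Qwen/Qwen3-4B", "Text", 0.080, 0.240, 33),
    Variant("novita/qwen/qwen3-4b-fp8", "Text", 0.030, 0.030, 128),
]


def cheapest_text_variant(input_tokens: int, output_tokens: int,
                          min_context_k: int = 0) -> Optional[Variant]:
    """Return the cheapest priced text variant with enough context, or None."""
    candidates = [
        v for v in VARIANTS
        if v.mode == "Text"
        and v.input_price is not None
        and v.output_price is not None
        and v.context_k >= min_context_k
    ]
    if not candidates:
        return None
    # Relative cost is enough for ranking, so the division by 1M is skipped.
    return min(
        candidates,
        key=lambda v: input_tokens * v.input_price + output_tokens * v.output_price,
    )


best = cheapest_text_variant(input_tokens=10_000, output_tokens=2_000)
print(best.model_key if best else "no matching variant")
```

Because Nebius charges more for output than for input, the ranking can depend on the input/output ratio when providers are close in price; with the figures listed here, the Novita AI variant is cheapest for any mix.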
