Qwen3 8B Fp8 Pricing & Specs | AI Models

Qwen3 8B Fp8 is a text model from Novita AI with a context window of 128K tokens and a maximum output of 20K tokens. Pricing starts at $0.035 per million input tokens and $0.138 per million output tokens (cheapest at Novita AI).

Capabilities

✗ Vision
✗ Function Calling
✓ Reasoning
✗ JSON Schema
✓ System Messages
✗ Web Search
✗ Prompt Caching
✗ Audio Input
✗ Audio Output

Specifications

Model Key	`novita/qwen/qwen3-8b-fp8`
Provider	Novita AI
Provider ID	novita
Mode	Text
Canonical Name	qwen-3-8b
Context Window	128K tokens
Max Output	20K tokens

Pricing

Type	Per 1K Tokens, $	Per 1M Tokens, $
Input Tokens	0.000035	0.035
Output Tokens	0.000138	0.138
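
As a rough illustration of how these per-token prices translate into the cost of a single request, the sketch below multiplies token counts by the per-1M prices from the table above. The helper name `estimate_cost` and the example token counts are hypothetical, and the prices are assumed to be in US dollars, as in the provider comparison table.

```python
# Per-1M-token prices for novita/qwen/qwen3-8b-fp8, copied from the pricing table.
# Assumption: prices are in US dollars.
INPUT_PRICE_PER_1M = 0.035
OUTPUT_PRICE_PER_1M = 0.138


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request (hypothetical helper, not a provider API)."""
    total = input_tokens * INPUT_PRICE_PER_1M + output_tokens * OUTPUT_PRICE_PER_1M
    return total / 1_000_000


# Made-up example: 10,000 input tokens and 2,000 output tokens.
cost = estimate_cost(10_000, 2_000)
print(f"{cost:.6f}")  # prints 0.000626
```

At these prices, output tokens cost roughly four times as much as input tokens, so long completions are the larger part of the bill.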

Benchmarks

Benchmark	Score	Rank
Intelligence Index	10.6	#179
Coding Index	7.1	#150
Math Index	24.3	#92
MMLU-Pro	0.6	#135
GPQA	0.5	#152
HLE	0.0	#220
LiveCodeBench	0.2	#144
AIME	0.2	#57
IFBench	0.3	#145
Time to First Token	0.98s	#165
SciCode	0.2	#175
MATH-500	0.8	#63
AIME 2025	0.2	#92
LCR	0.0	#151
TerminalBench Hard	0.0	#124
TAU2	0.2	#105

Price Comparison by Provider

Compare prices for Qwen3 8B Fp8 across different providers. The same model may be available through multiple providers at different price points.

Provider	Model Key	Input Price, $ per 1M tokens	Output Price, $ per 1M tokens
Novita AI	novita/qwen/qwen3-8b-fp8	0.035	0.138
LlamaGate	llamagate/qwen3-8b	0.040	0.140
Fireworks AI	fireworks_ai/accounts/fireworks/models/qwen3-reranker-8b	N/A	N/A
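
One way to use this comparison is to pick the cheapest provider for a given mix of input and output tokens while skipping rows without a listed price. The sketch below does that with the three rows from the table above; the function name `cheapest_provider` is hypothetical, `None` stands in for "N/A", and prices are assumed to be per 1M tokens because the Novita AI values match the Per 1M Tokens column of the pricing table.

```python
from typing import Optional

# (provider, model key, input price, output price), copied from the comparison table.
# None represents an "N/A" entry. Assumption: prices are per 1M tokens.
PROVIDERS: list[tuple[str, str, Optional[float], Optional[float]]] = [
    ("Novita AI", "novita/qwen/qwen3-8b-fp8", 0.035, 0.138),
    ("LlamaGate", "llamagate/qwen3-8b", 0.040, 0.140),
    ("Fireworks AI", "fireworks_ai/accounts/fireworks/models/qwen3-reranker-8b", None, None),
]


def cheapest_provider(input_tokens: int, output_tokens: int) -> str:
    """Return the provider with the lowest cost for the given token mix."""
    # Ignore rows where either price is missing.
    priced = [p for p in PROVIDERS if p[2] is not None and p[3] is not None]
    best = min(priced, key=lambda p: input_tokens * p[2] + output_tokens * p[3])
    return best[0]


print(cheapest_provider(1_000_000, 200_000))  # prints Novita AI
```

Because Novita AI is cheaper on both input and output, it wins for any token mix here; the weighting only matters when one provider is cheaper on input and another on output.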

All Variants

All available versions, regions, and API endpoints for Qwen3 8B Fp8.

Model Key	Provider	Mode	Input Price, $ per 1M tokens	Output Price, $ per 1M tokens	Context	Max Output	Vision	Functions
fireworks_ai/accounts/fireworks/models/qwen3-8b	Fireworks AI	Text	0.200	0.200	41K	41K	no	no
fireworks_ai/accounts/fireworks/models/qwen3-reranker-8b	Fireworks AI	Rerank	N/A	N/A	41K	41K	no	no
llamagate/qwen3-8b	LlamaGate	Text	0.040	0.140	33K	8K	no	yes
novita/qwen/qwen3-8b-fp8	Novita AI	Text	0.035	0.138	128K	20K	no	no
novita/qwen/qwen3-embedding-8b	Novita AI	Embedding	0.070	N/A	33K	4K	no	no
novita/qwen/qwen3-reranker-8b	Novita AI	Rerank	0.050	0.050	33K	4K	no	no
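
Since the variants differ in mode, context window, and function-calling support, a small filter can narrow the list to the entries that fit a workload. The sketch below encodes a subset of the columns from the table above; the `Variant` class and `select_variants` function are hypothetical names, and context sizes are stored in thousands of tokens exactly as listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    model_key: str
    mode: str
    context_k: int      # context window in thousands of tokens, as listed
    max_output_k: int   # max output in thousands of tokens, as listed
    functions: bool     # "Functions" column


# Rows copied from the variants table (prices and the Vision column omitted).
VARIANTS = [
    Variant("fireworks_ai/accounts/fireworks/models/qwen3-8b", "Text", 41, 41, False),
    Variant("fireworks_ai/accounts/fireworks/models/qwen3-reranker-8b", "Rerank", 41, 41, False),
    Variant("llamagate/qwen3-8b", "Text", 33, 8, True),
    Variant("novita/qwen/qwen3-8b-fp8", "Text", 128, 20, False),
    Variant("novita/qwen/qwen3-embedding-8b", "Embedding", 33, 4, False),
    Variant("novita/qwen/qwen3-reranker-8b", "Rerank", 33, 4, False),
]


def select_variants(mode: str, min_context_k: int = 0,
                    need_functions: bool = False) -> list[str]:
    """Return model keys matching the mode, minimum context, and function support."""
    return [
        v.model_key
        for v in VARIANTS
        if v.mode == mode
        and v.context_k >= min_context_k
        and (v.functions or not need_functions)
    ]


print(select_variants("Text", min_context_k=100))
print(select_variants("Text", need_functions=True))
```

With the rows above, the first call returns only `novita/qwen/qwen3-8b-fp8` and the second returns only `llamagate/qwen3-8b`.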
