DeepSeek R1 Distill Qwen3 8B is a DeepSeek language model with a 131K-token context window, priced from $0.20 per 1M input tokens and $0.20 per 1M output tokens. It is an 8B-parameter model that distills the chain-of-thought of DeepSeek R1 0528 into the Qwen3 8B base, and it achieves strong reasoning benchmark performance among open-source models.
Specifications
Canonical ID: deepseek-r1-distill-qwen3-8b
Type: Language
Status: Active
Creator: DeepSeek
Providers
Context Window: 131K tokens
Input Modalities: Text
Output Modalities: Text
Parameters: 8B

Capabilities

Input (1/5)
Supported: Text
Not listed as supported: Image, Audio, Video, PDF

Output (1/5)
Supported: Text
Not listed as supported: Image, Audio, Video, Embedding

Capabilities (0/13)
Not listed as supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

Prices are in US Dollars ($) per 1M tokens, standard tier.

Provider | Model ID | Input ($ / 1M) | Output ($ / 1M)
Fireworks AI | fireworks_ai/accounts/fireworks/models/deepseek-r1-0528-distill-qwen3-8b | $0.20 | $0.20
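
As a rough illustration of how the per-1M-token rates above turn into a per-request cost, the sketch below multiplies token counts by the listed Fireworks AI prices. The helper name `estimate_cost` and the example token counts are hypothetical; only the $0.20 input and output rates come from this page.

```python
from decimal import Decimal

# Rates taken from the pricing listing above (USD per 1M tokens, Fireworks AI).
INPUT_PER_1M = Decimal("0.20")
OUTPUT_PER_1M = Decimal("0.20")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_PER_1M
             + Decimal(output_tokens) * OUTPUT_PER_1M)
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Made-up workload: 50K prompt tokens and 10K completion tokens.
    print(estimate_cost(50_000, 10_000))
```

`Decimal` is used instead of floats so that small per-token amounts do not pick up binary rounding noise when many requests are summed.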

Versions

Version | Released | Context | Input / 1M | Output / 1M | Status
DeepSeek R1T2 Chimera | - | 164K | - | - | Available
DeepSeek R1 528 | - | 164K | $0.200 | $0.250 | Available
DeepSeek R1 Distill Qwen 32B | - | 131K | $0.150 | $0.150 | Available
DeepSeek R1 Distill Llama 70B | - | 131K | $0.200 | $0.375 | Deprecated
DeepSeek R1 | - | 164K | $0.280 | $0.400 | Available
DeepSeek R1 Distill Qwen3 8B | - | 131K | $0.200 | $0.200 | Current
DeepSeek R1 Distill Qwen 14B | - | 131K | $0.070 | $0.070 | Available
DeepSeek R1 Distill Llama 8B | - | 131K | $0.025 | $0.025 | Available
DeepSeek R1 Distill Qwen 1.5B | - | 131K | $0.090 | $0.090 | Available
DeepSeek R1 528 Turbo | - | 33K | $1.00 | $3.00 | Available
DeepSeek R1 528B | - | 131K | $0.550 | $2.19 | Available
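
To compare the versions listed above for a given workload, one option is to rank them by total cost. The sketch below is only an illustration: the function names and the workload sizes are hypothetical, the prices are copied from the listing, DeepSeek R1T2 Chimera is left out because no price is shown for it, and deprecated entries are skipped.

```python
# (name, input $/1M, output $/1M, status), copied from the versions listing.
VERSIONS = [
    ("DeepSeek R1 528", 0.200, 0.250, "Available"),
    ("DeepSeek R1 Distill Qwen 32B", 0.150, 0.150, "Available"),
    ("DeepSeek R1 Distill Llama 70B", 0.200, 0.375, "Deprecated"),
    ("DeepSeek R1", 0.280, 0.400, "Available"),
    ("DeepSeek R1 Distill Qwen3 8B", 0.200, 0.200, "Current"),
    ("DeepSeek R1 Distill Qwen 14B", 0.070, 0.070, "Available"),
    ("DeepSeek R1 Distill Llama 8B", 0.025, 0.025, "Available"),
    ("DeepSeek R1 Distill Qwen 1.5B", 0.090, 0.090, "Available"),
    ("DeepSeek R1 528 Turbo", 1.00, 3.00, "Available"),
    ("DeepSeek R1 528B", 0.550, 2.19, "Available"),
]


def workload_cost(input_rate, output_rate, input_tokens, output_tokens):
    """USD cost of a workload given per-1M-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def rank_by_cost(input_tokens, output_tokens):
    """Return (name, cost) pairs for non-deprecated versions, cheapest first."""
    costs = [
        (name, workload_cost(i_rate, o_rate, input_tokens, output_tokens))
        for name, i_rate, o_rate, status in VERSIONS
        if status != "Deprecated"
    ]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Made-up workload: 2M input tokens and 500K output tokens.
    for name, cost in rank_by_cost(2_000_000, 500_000):
        print(f"{name}: ${cost:.3f}")
```

Price alone says nothing about output quality or context length, so a ranking like this is best treated as one input to a decision, not the decision itself.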

Model IDs

accounts/fireworks/models/deepseek-r1-0528-distill-qwen3-8b
deepseek-r1-distill-qwen3-8b
fireworks_ai/accounts/fireworks/models/deepseek-r1-0528-distill-qwen3-8b
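
The three IDs above appear to be a short canonical name, a Fireworks-native model path, and that same path with a `fireworks_ai/` provider prefix. Assuming that reading is right (it is inferred from the listing, not stated by it), the sketch below splits a prefixed ID into its provider part and native path. The helper name `split_provider_prefix` is hypothetical.

```python
# Prefixed ID copied verbatim from the Model IDs list above.
PREFIXED_ID = (
    "fireworks_ai/accounts/fireworks/models/deepseek-r1-0528-distill-qwen3-8b"
)


def split_provider_prefix(model_id: str) -> tuple[str, str]:
    """Split 'provider/native/path' into (provider, native path).

    Assumes the first path segment is the provider prefix, which is an
    interpretation of the IDs listed on this page.
    """
    provider, sep, native = model_id.partition("/")
    if not sep or not native:
        raise ValueError(f"no provider prefix found in {model_id!r}")
    return provider, native


if __name__ == "__main__":
    provider, native = split_provider_prefix(PREFIXED_ID)
    print(provider)
    print(native)
```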