DeepSeek R1 Distill Llama 8B is a language model from DeepSeek with a 131K-token context window. It is available from 3 providers, with prices starting at $0.025 per 1M input tokens and $0.025 per 1M output tokens. It is a compact 8B-parameter Llama-based model distilled from DeepSeek R1, delivering strong reasoning performance in a lightweight architecture.
Specifications
Canonical ID: deepseek-r1-distill-llama-8b
Type: Language
Status: Active
Creator: DeepSeek
Providers: 3
Context Window: 131K tokens
Input Modalities: Text
Output Modalities: Text
Parameters: 8B
Benchmarks
Intelligence Index: 12.1 (#356)
Math Index: 41.3 (#146)
MMLU-Pro: 0.5 (#265)
GPQA: 0.3 (#412)
HLE: 0.0 (#365)
LiveCodeBench: 0.2 (#242)
AIME: 0.3 (#67)
IFBench: 0.2 (#386)
Time to First Token: no value listed
SciCode: 0.1 (#378)
MATH-500: 0.9 (#83)
AIME 2025: 0.4 (#146)
LCR: 0.0 (#356)
Output TPS: 0.0 (#320)

Capabilities

Input (1/5)
Supported: Text
Not supported: Image, Audio, Video, PDF

Output (1/5)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (0/13)
Not supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

Cost Calculator
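
The per-token arithmetic behind a cost estimate can be sketched in a few lines of Python. This is a minimal illustration, not the page's own calculator: it assumes the starting prices quoted for this model ($0.025 per 1M input tokens and $0.025 per 1M output tokens) and ignores any provider-specific differences. The function name `estimate_cost` and its parameters are hypothetical.

```python
from decimal import Decimal

# Starting prices quoted for DeepSeek R1 Distill Llama 8B, in USD per 1M tokens.
# Assumption: individual providers may charge more than these starting prices.
INPUT_PRICE_PER_1M = Decimal("0.025")
OUTPUT_PRICE_PER_1M = Decimal("0.025")
TOKENS_PER_PRICING_UNIT = Decimal(1_000_000)


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: Decimal = INPUT_PRICE_PER_1M,
    output_price: Decimal = OUTPUT_PRICE_PER_1M,
) -> Decimal:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Each side is billed proportionally to its token count.
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / TOKENS_PER_PRICING_UNIT


if __name__ == "__main__":
    # Example: a 2M-token input with a 500K-token output.
    print(f"${estimate_cost(2_000_000, 500_000)}")
```

Passing other `input_price` and `output_price` values (for example, those of another entry in the Versions list) gives a like-for-like comparison. `Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding error.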

Versions

Version                       Released  Context  Input / 1M  Output / 1M  Status
Llama 3.3 70B Instruct                  131K     $0.100      $0.200       Available
Llama 3.2 3B Instruct                   131K     $0.015      $0.020       Deprecated
Llama 3.2 1B Instruct                   131K     $0.027      $0.080       Deprecated
Llama 3.1 405B Instruct                 131K     $0.120      $0.300       Deprecating
Llama 3.1 70B Instruct                  131K     $0.100      $0.100       Available
Llama 3.1 8B Instruct                   200K     $0.020      $0.030       Available
Llama 3.1 70B                           128K     $0.600      $0.600       Available
Llama 3.1 8B                            131K     $0.030      $0.050       Available
Llama 3 70B Instruct                    131K     $0.120      $0.300       Available
Llama 3 8B Instruct                     32K      $0.030      $0.040       Available
DeepSeek R1 Distill Llama 8B            131K     $0.025      $0.025       Current

Model IDs