DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts language model from DeepSeek with 284B total and 13B activated parameters. It supports a 1.0M-token context window, up to 384K output tokens, reasoning, and tool use. It is available from 11 providers, with prices starting at $0.09 per 1M input tokens and $0.18 per 1M output tokens.
Specifications
Canonical ID: deepseek-v4-flash
Type: Language
Status: Active
Creator: DeepSeek
Providers: 11
Context Window: 1.0M tokens
Max Output: 384K tokens
Input Modalities: Image, PDF, Text
Output Modalities: Text
Reasoning Efforts: default
Parameters: 158B
Hugging Face Likes: 649
Hugging Face Downloads (30d): 25,391
Hugging Face Downloads (all-time): 25,391
Release Date: 2 months ago
Benchmarks
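| Benchmark | Score | Rank |
|---|---|---|
| Intelligence Index | 40.3 | #23 |
| Coding Index | 56.2 | #19 |
| GPQA | 0.9 | #23 |
| HLE | 0.3 | #28 |
| IFBench | 0.8 | #9 |
| Time to First Token | 1.01s | #354 |
| SciCode | 0.4 | #44 |
| LCR | 0.6 | #79 |
| TerminalBench Hard | 0.4 | #55 |
| TAU2 | 1.0 | #25 |
| Output TPS | 94.3 | #130 |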

Capabilities

Input (3/5)
Supported: Text, Image, PDF
Not supported: Audio, Video

Output (1/5)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (7/13)
Supported: Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Prompt Caching, Assistant Prefill
Not supported: Adaptive Reasoning, Web Search, URL Context, Computer Use, Code Execution, File Search

Pricing by Provider

Prices are in US dollars ($) per 1M tokens. Blank cells mean the provider lists no batch pricing.

| Provider | Model ID | Standard Input | Standard Output | Standard Cache Read | Batch Input | Batch Output | Batch Cache Read |
|---|---|---|---|---|---|---|---|
| Alibaba Qwen | deepseek-v4-flash | $0.2 | $0.4 | N/A | $0.1 | $0.2 | N/A |
| Azure AI Foundry | azure_ai/deepseek-v4-flash | $0.19 | $0.51 | N/A | | | |
| DeepSeek | deepseek-v4-flash(1) | $0.14 | $0.28 | $0.0028 | $0.07 | $0.14 | $0.0014 |
| Fireworks AI | fireworks_ai/deepseek-v4-flash | $0.14 | $0.28 | $0.028 | | | |
| Hugging Face | novita:deepseek/deepseek-v4-flash | $0.14 | $0.28 | N/A | | | |
| Libertai | libertai/deepseek-v4-flash | $0.25 | $1.75 | N/A | | | |
| OpenRouter | deepseek/deepseek-v4-flash | $0.09 | $0.18 | $0.018 | | | |
| Pinstripes | pinstripes/ps/deepseek-v4-flash | $0.1 | $0.2 | N/A | | | |
| Tencent | tencent/deepseek-v4-flash | $0.14 | $0.28 | $0.0028 | | | |
| Tensormesh | tensormesh/deepseek-ai/DeepSeek-V4-Flash | $0.14 | $0.28 | N/A | | | |
| Vercel AI Gateway | deepseek/deepseek-v4-flash | $0.14 | $0.28 | $0.0028 | | | |
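
To compare providers for a given workload, the per-1M-token prices can be turned into a total cost. The following is a minimal sketch, not an official calculator: it uses the standard input and output prices of four providers from the pricing table, ignores cache-read and batch pricing, and the function names `request_cost` and `cheapest` as well as the example workload are hypothetical.

```python
# Standard (non-batch) prices in USD per 1M tokens, copied from the pricing table.
# Each value is (input_price, output_price). Cache-read pricing is ignored here.
PRICES = {
    "OpenRouter": (0.09, 0.18),
    "DeepSeek": (0.14, 0.28),
    "Azure AI Foundry": (0.19, 0.51),
    "Libertai": (0.25, 1.75),
}


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Return the USD cost of a workload given per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest(input_tokens, output_tokens, prices=PRICES):
    """Return (provider, cost) for the provider with the lowest total cost."""
    costs = {
        name: request_cost(input_tokens, output_tokens, in_price, out_price)
        for name, (in_price, out_price) in prices.items()
    }
    best = min(costs, key=costs.get)
    return best, costs[best]


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for name, (in_price, out_price) in PRICES.items():
        cost = request_cost(2_000_000, 500_000, in_price, out_price)
        print(f"{name}: ${cost:.4f}")
    print("Cheapest:", cheapest(2_000_000, 500_000))
```

Because output tokens cost more than input tokens at every listed provider, output-heavy workloads widen the gap between the cheapest and most expensive options.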

Cheapest Instances to Run It

Cloud GPU instances that can host DeepSeek V4 Flash, ranked by cheapest on-demand price. The model needs about 379 GB of GPU memory at FP16 precision (estimated from its parameter count), so treat the fit as guidance rather than a guarantee.
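
The page does not state the formula behind the 379 GB figure. As a rough sketch under stated assumptions, a weights-only estimate multiplies the parameter count by the bytes per parameter and an overhead factor. The 1.2 overhead factor and the function name `estimate_vram_gb` are assumptions; they were chosen because, with the 158B parameter count listed under Specifications, the result lands close to the quoted figure.

```python
# Bytes per parameter for common precisions (general facts, not page data).
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "INT4": 0.5}


def estimate_vram_gb(params_billions, precision="FP16", overhead=1.2):
    """Rough GPU memory estimate in GB.

    params_billions: parameter count in billions.
    overhead: assumed multiplier for runtime overhead beyond the weights
              (hypothetical value, not stated on the page).
    """
    return params_billions * BYTES_PER_PARAM[precision] * overhead


if __name__ == "__main__":
    # 158B parameters at FP16 with the assumed 1.2 overhead factor.
    print(f"FP16 estimate: {estimate_vram_gb(158):.0f} GB")
```

Actual memory use also depends on context length, batch size, and the serving stack, which is why the fit is guidance rather than a guarantee.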

Showing all clouds at FP16 (full precision), with prices in US dollars ($).

| Instance | Cloud | GPU | VRAM | Price | Cheapest region |
|---|---|---|---|---|---|
| Standard_NCC40ads_H100_v5 | Azure | NVIDIA H100 | 752 GB | $6.98/hr | eastus2 |
| g7e.24xlarge | AWS | 4× RTX PRO Server 6000 | 384 GB | $16.57/hr | us-east-1 |
| p4de.24xlarge | AWS | 8× A100 | 640 GB | $27.45/hr | us-east-1 |
7 more instances can run DeepSeek V4 Flash
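
The ranking in this section amounts to filtering instances by available GPU memory and sorting by hourly price. A minimal sketch of that logic follows, using the three instances listed in this section and the 379 GB requirement. The `Instance` type and `rank_hosts` function are hypothetical names, and the VRAM values are read from the listing as shown.

```python
from typing import NamedTuple


class Instance(NamedTuple):
    name: str
    cloud: str
    vram_gb: int
    usd_per_hour: float


# The three instances listed in this section (on-demand prices, cheapest region).
INSTANCES = [
    Instance("p4de.24xlarge", "AWS", 640, 27.45),
    Instance("Standard_NCC40ads_H100_v5", "Azure", 752, 6.98),
    Instance("g7e.24xlarge", "AWS", 384, 16.57),
]


def rank_hosts(instances, required_gb):
    """Keep instances with enough total VRAM, cheapest hourly price first."""
    fitting = [inst for inst in instances if inst.vram_gb >= required_gb]
    return sorted(fitting, key=lambda inst: inst.usd_per_hour)


if __name__ == "__main__":
    for inst in rank_hosts(INSTANCES, required_gb=379):
        print(f"{inst.name} ({inst.cloud}): {inst.vram_gb} GB at ${inst.usd_per_hour}/hr")
```

Note that g7e.24xlarge clears the 379 GB estimate by only 5 GB, so a larger overhead assumption would exclude it.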

Versions

| Version | Released | Context | Input / 1M | Output / 1M | Status |
|---|---|---|---|---|---|
| DeepSeek V4 Flash | | 1.0M | $0.090 | $0.180 | Current |
| DeepSeek V4 Flash Thinking | | 200K | $0.250 | $1.75 | Available |

Other Models

| Model | Tier | Released | Context | Input / 1M | Output / 1M |
|---|---|---|---|---|---|
| DeepSeek V4 Pro | Pro | | 1.0M | $0.435 | $0.870 |

Model IDs

accounts/fireworks/models/deepseek-v4-flash
azure_ai/deepseek-v4-flash
deepseek-ai/DeepSeek-V4-Flash
deepseek-v4-flash
deepseek-v4-flash-high
deepseek-v4-flash-non-reasoning
deepseek-v4-flash(1)
deepseek-v4-flash*
deepseek/deepseek-v4-flash
deepseek/deepseek-v4-flash:free
fireworks_ai/accounts/fireworks/models/deepseek-v4-flash
fireworks_ai/deepseek-v4-flash
libertai/deepseek-v4-flash
pinstripes/ps/deepseek-v4-flash
tencent/deepseek-v4-flash
tensormesh/deepseek-ai/DeepSeek-V4-Flash