Name: DeepSeek V4 Flash
Brand: DeepSeek

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts language model from DeepSeek with 284B total and 13B activated parameters. It supports a 1M-token context window and up to 384K output tokens, with reasoning and tool-use capabilities. It is available from 11 providers, with prices starting at $0.0938 per 1M input tokens and $0.188 per 1M output tokens.

Specifications
Canonical ID	`deepseek-v4-flash`
Type	Language
Status	Active
Creator	DeepSeek
Providers	Alibaba Qwen, Azure AI Foundry, DeepSeek, Fireworks AI, Hugging Face, OpenRouter, Libertai, Pinstripes, Tencent, Tensormesh, Vercel AI Gateway
Context Window	1.0M tokens
Max Output	384K tokens
Input Modalities	Image, PDF, Text
Output Modalities	Text
Reasoning Efforts	default
Parameters	158B
HuggingFace Likes	649
HuggingFace Downloads (30d)	25,391
HuggingFace Downloads (all-time)	25,391
Release Date	2026-04-24 · 3 months ago

Benchmarks
Intelligence Index	40.3 #34
Coding Index	56.2 #32
GPQA	0.9 #32
HLE	0.3 #37
IFBench	0.8 #9
Time to First Token	0.86s #431
SciCode	0.4 #54
LCR	0.6 #90
TerminalBench Hard	0.4 #57
TAU2	1.0 #25
Output TPS	117.7 #56
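
The two speed figures above (time to first token and output tokens per second) can be combined into a rough end-to-end latency estimate. The sketch below is illustrative only: the function name is hypothetical, and it assumes throughput stays constant for the whole response, which real deployments do not guarantee.

```python
# Benchmark figures copied from the table above.
TTFT_SECONDS = 0.86   # Time to First Token
OUTPUT_TPS = 117.7    # Output tokens per second


def estimate_response_seconds(output_tokens: int,
                              ttft: float = TTFT_SECONDS,
                              tps: float = OUTPUT_TPS) -> float:
    """Rough wall-clock time for one response.

    Assumption: generation runs at a steady `tps` after the first token.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft + output_tokens / tps


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(f"{n:>6} output tokens: ~{estimate_response_seconds(n):.1f} s")
```

Provider, load, and prompt length all affect both numbers, so treat the result as an order-of-magnitude guide.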

Capabilities

Input (3/5)
Text	✓
Image	✓
Audio	·
Video	·
PDF	✓

Output (1/5)
Text	✓
Image	·
Audio	·
Video	·
Embedding	·

Capabilities (7/13)
Reasoning	✓
Adaptive Reasoning	·
Function Calling	✓
Parallel Function Calling	✓
Structured Outputs	✓
Native JSON Schema	✓
Web Search	·
URL Context	·
Computer Use	·
Code Execution	·
File Search	·
Prompt Caching	✓
Assistant Prefill	✓

Pricing by Provider

Provider	Standard Input $ / 1M	Standard Output $ / 1M	Standard Cache Read $ / 1M	Batch Input $ / 1M	Batch Output $ / 1M	Batch Cache Read $ / 1M
Alibaba Qwen `deepseek-v4-flash`	$0.2	$0.4	N/A	$0.1	$0.2	N/A
Azure AI Foundry `azure_ai/deepseek-v4-flash`	$0.19	$0.51	N/A	—	—	—
DeepSeek `deepseek-v4-flash(1)`	$0.14	$0.28	$0.0028	$0.07	$0.14	$0.0014
Fireworks AI `fireworks_ai/deepseek-v4-flash`	$0.14	$0.28	$0.028	—	—	—
Hugging Face `novita:deepseek/deepseek-v4-flash`	$0.14	$0.28	N/A	—	—	—
Libertai `libertai/deepseek-v4-flash`	$0.25	$1.75	N/A	—	—	—
OpenRouter `deepseek/deepseek-v4-flash`	$0.0938	$0.188	$0.0188	—	—	—
Pinstripes `pinstripes/ps/deepseek-v4-flash`	$0.1	$0.2	N/A	—	—	—
Tencent `tencent/deepseek-v4-flash`	$0.14	$0.28	$0.0028	—	—	—
Tensormesh `tensormesh/deepseek-ai/DeepSeek-V4-Flash`	$0.14	$0.28	N/A	—	—	—
Vercel AI Gateway `deepseek/deepseek-v4-flash`	$0.14	$0.28	$0.028	—	—	—
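
The per-1M-token prices in the table above can be turned into a per-workload estimate with a little arithmetic. The following is a minimal sketch, not the site's own calculator: the `Price` class and the function names are hypothetical, only standard (non-batch) prices are included, and it assumes that `input_tokens` counts uncached input only and that cached tokens are billed at the normal input rate when a provider lists no cache-read price.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Price:
    """Standard prices in USD per 1M tokens, copied from the table above."""
    input: float
    output: float
    cache_read: Optional[float] = None  # None where the table says N/A


PRICES = {
    "Alibaba Qwen": Price(0.2, 0.4),
    "Azure AI Foundry": Price(0.19, 0.51),
    "DeepSeek": Price(0.14, 0.28, 0.0028),
    "Fireworks AI": Price(0.14, 0.28, 0.028),
    "Hugging Face": Price(0.14, 0.28),
    "Libertai": Price(0.25, 1.75),
    "OpenRouter": Price(0.0938, 0.188, 0.0188),
    "Pinstripes": Price(0.1, 0.2),
    "Tencent": Price(0.14, 0.28, 0.0028),
    "Tensormesh": Price(0.14, 0.28),
    "Vercel AI Gateway": Price(0.14, 0.28, 0.028),
}


def estimate_cost(price: Price, input_tokens: int, output_tokens: int,
                  cache_read_tokens: int = 0, calls: int = 1) -> float:
    """Estimated USD cost for `calls` identical requests."""
    # Assumption: with no listed cache-read price, cached tokens cost the input rate.
    cache_rate = price.cache_read if price.cache_read is not None else price.input
    per_call = (input_tokens * price.input
                + output_tokens * price.output
                + cache_read_tokens * cache_rate) / 1_000_000
    return per_call * calls


def cheapest_provider(input_tokens: int, output_tokens: int,
                      cache_read_tokens: int = 0, calls: int = 1) -> str:
    """Name of the provider with the lowest estimated cost for this workload."""
    return min(PRICES, key=lambda name: estimate_cost(
        PRICES[name], input_tokens, output_tokens, cache_read_tokens, calls))


if __name__ == "__main__":
    for name, price in sorted(PRICES.items(),
                              key=lambda kv: estimate_cost(kv[1], 1_000_000, 500_000)):
        print(f"{name:<18} ${estimate_cost(price, 1_000_000, 500_000):.4f}")
```

The ranking depends on the workload mix: a request pattern dominated by cache reads can favor a provider with a low cache-read price over the one with the lowest input price.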

Cheapest Instances to Run It

Cloud GPU instances that can host DeepSeek V4 Flash, ranked by cheapest on-demand price. The model needs about 379 GB of GPU memory at FP16 precision (estimated from its parameter count), so treat the fit as guidance rather than a guarantee.

Instance	Cloud	GPU	VRAM	Price	Cheapest region
g7e.24xlarge	AWS	4× RTX PRO Server 6000	384 GB	$16.57/hr	us-east-1
p4de.24xlarge	AWS	8× A100	640 GB	$27.45/hr	us-east-1
Standard_ND96amsr_A100_v4	Azure	8× NVIDIA A100 (80GB)	640 GB	$32.77/hr	westus2
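
The 379 GB estimate quoted above is consistent with a common rule of thumb: parameter count times bytes per parameter, plus a margin for runtime overhead. The sketch below reproduces that figure from the 158B value in the Specifications table, 2 bytes per FP16 weight, and a 20% overhead factor. The 20% factor is an assumption that happens to match the quoted number; the page does not state its formula. The function names are hypothetical, and only the three instances listed above are included.

```python
from typing import Optional, Tuple

# (instance, cloud, total VRAM in GB, on-demand USD per hour), from the table above.
INSTANCES = [
    ("g7e.24xlarge", "AWS", 384, 16.57),
    ("p4de.24xlarge", "AWS", 640, 27.45),
    ("Standard_ND96amsr_A100_v4", "Azure", 640, 32.77),
]


def estimate_vram_gb(params_billion: float,
                     bytes_per_param: float = 2.0,   # FP16
                     overhead: float = 1.2) -> float:  # assumed 20% margin
    """Rough GPU memory needed to hold the weights, in GB."""
    return params_billion * bytes_per_param * overhead


def cheapest_fit(required_gb: float) -> Optional[Tuple[str, str, int, float]]:
    """Cheapest listed instance whose total VRAM covers `required_gb`, if any."""
    fits = [inst for inst in INSTANCES if inst[2] >= required_gb]
    return min(fits, key=lambda inst: inst[3]) if fits else None


if __name__ == "__main__":
    need = estimate_vram_gb(158)
    print(f"Estimated requirement: {need:.1f} GB")
    print("Cheapest fit:", cheapest_fit(need))
```

Under the same assumptions, the 284B total-parameter figure from the model description gives an estimate above 640 GB, which none of the three listed instances would cover. Lower-precision weights shrink the requirement proportionally, for example `bytes_per_param=1.0` for FP8.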

Versions

Version	Released	Context	Input / 1M	Output / 1M	Status
DeepSeek V4 Flash	2026-04-24	1.0M	$0.094	$0.188	Current
DeepSeek V4 Flash Thinking	—	200K	$0.250	$1.75	Available

Other Models

Model	Tier	Released	Context	Input / 1M	Output / 1M
DeepSeek V4 Pro	Pro	2026-04-24	1.0M	$0.435	$0.870

Model IDs

`accounts/fireworks/models/deepseek-v4-flash`
`azure_ai/deepseek-v4-flash`
`deepseek-ai/DeepSeek-V4-Flash`
`deepseek-v4-flash`
`deepseek-v4-flash-high`
`deepseek-v4-flash-non-reasoning`
`deepseek-v4-flash(1)`
`deepseek-v4-flash*`
`deepseek/deepseek-v4-flash`
`deepseek/deepseek-v4-flash:free`
`fireworks_ai/accounts/fireworks/models/deepseek-v4-flash`
`fireworks_ai/deepseek-v4-flash`
`libertai/deepseek-v4-flash`
`pinstripes/ps/deepseek-v4-flash`
`tencent/deepseek-v4-flash`
`tensormesh/deepseek-ai/DeepSeek-V4-Flash`

API Endpoints

Section	Endpoint
Capabilities	GET /api/v1/models/deepseek-v4-flash
Pricing by Provider	GET /api/v1/models/deepseek-v4-flash/pricing
Cost Calculator	GET /api/v1/models/deepseek-v4-flash/pricing/calculate?input_tokens=1000000&output_tokens=500000
Cheapest Instances to Run It	GET /api/v1/models/deepseek-v4-flash/instances
Versions	GET /api/v1/models?family=v4
Other Models	GET /api/v1/models/deepseek-v4-flash/similar
Model IDs	GET /api/v1/models/deepseek-v4-flash
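
The Cost Calculator endpoint listed above takes token counts as query parameters. As a minimal sketch, the request URL can be built as follows. The host in `BASE_URL` is a hypothetical placeholder, since the page gives only the path; the response format, authentication, and any parameters beyond `input_tokens` and `output_tokens` are not documented here, so the example stops at constructing the URL.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the page lists only the path, not the API host.
BASE_URL = "https://api.example.com"


def calculate_url(model_id: str, input_tokens: int, output_tokens: int,
                  base_url: str = BASE_URL) -> str:
    """Build the pricing-calculation URL for a model, mirroring the path above."""
    query = urlencode({"input_tokens": input_tokens,
                       "output_tokens": output_tokens})
    return f"{base_url}/api/v1/models/{model_id}/pricing/calculate?{query}"


if __name__ == "__main__":
    print(calculate_url("deepseek-v4-flash", 1_000_000, 500_000))
```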
