Name: DeepSeek-OCR
Brand: DeepSeek

DeepSeek-OCR is DeepSeek's image-to-text model with an 8K-token context window and up to 8K output tokens. It is available from 4 providers, with pricing starting at $0.030 per 1M input tokens and $0.030 per 1M output tokens. It is a multimodal OCR model that compresses long document contexts via optical 2D mapping, combining a DeepEncoder with a compact MoE language model.

Specifications
Canonical ID	`deepseek-ocr`
Type	Image to Text
Status	Active
Creator	DeepSeek
Providers	Google Gemini, Google Vertex AI, Hugging Face, Novita AI
Context Window	8K tokens
Max Output	8K tokens
Input Modalities	Image, PDF, Text
Output Modalities	Text
Parameters	3.34B
HuggingFace Likes	3,218
HuggingFace Downloads (30d)	2,082,348
HuggingFace Downloads (all-time)	22,155,059

Capabilities

Input: 3/5
Text	✓
Image	✓
Audio	·
Video	·
PDF	✓

Output: 1/5
Text	✓
Image	·
Audio	·
Video	·
Embedding	·

Capabilities: 1/13
Reasoning	·
Adaptive Reasoning	·
Function Calling	·
Parallel Function Calling	·
Structured Outputs	✓
Native JSON Schema	·
Web Search	·
URL Context	·
Computer Use	·
Code Execution	·
File Search	·
Prompt Caching	·
Assistant Prefill	·

Pricing by Provider

Standard tier
Provider	Model ID	Input $ / 1M	Output $ / 1M
Hugging Face	`novita:deepseek/deepseek-ocr`	$0.030	$0.030
Novita	`novita/deepseek/deepseek-ocr`	$0.030	$0.030
Google Gemini	`deepseek-ocr-maas`	$0.300	$1.20
Google Vertex AI	`vertex_ai/deepseek-ai/deepseek-ocr-maas`	$0.300	$1.20
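
The per-token rates above translate into a job cost by simple arithmetic: tokens times the rate per million, summed over input and output, times the number of calls. The following is a minimal sketch of that calculation. The rates are copied from the table; the dictionary keys, the `Rate` class, and the `estimate_cost` function are hypothetical names chosen for illustration and are not part of any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """USD price per 1M tokens, taken from the pricing table."""
    input_per_million: float
    output_per_million: float


# Keys are illustrative labels, not official provider identifiers.
RATES = {
    "hugging_face": Rate(0.030, 0.030),
    "novita": Rate(0.030, 0.030),
    "google_gemini": Rate(0.300, 1.20),
    "google_vertex_ai": Rate(0.300, 1.20),
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int, calls: int = 1) -> float:
    """Estimate total USD cost for `calls` requests of the given token sizes."""
    rate = RATES[provider]
    per_call = (
        input_tokens * rate.input_per_million
        + output_tokens * rate.output_per_million
    ) / 1_000_000
    return per_call * calls


if __name__ == "__main__":
    # Example workload: 1,000 pages at 4,000 input and 2,000 output tokens each.
    for name in RATES:
        print(f"{name}: ${estimate_cost(name, 4_000, 2_000, calls=1_000):.2f}")
```

For that example workload, the estimate comes to about $0.18 on Novita or Hugging Face and about $3.60 on Google Gemini or Google Vertex AI. Actual token counts per page depend on image resolution and document density, so treat these figures as rough planning numbers.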

Versions

Version	Released	Context	Input / 1M	Output / 1M	Status
DeepSeek-OCR 2	—	—	—	—	Available
Qianfan OCR Fast	2026-04-20	66K	—	—	Deprecated
DeepSeek-OCR	—	8K	$0.030	$0.030	Current
DeepSeek-OCR	—	—	—	—	Available
Document OCR	—	—	—	—	Available
Mistral OCR	—	—	$2.00	$3.00	Available
OCR	—	—	—	—	Available
Prebuilt Document	—	—	—	—	Available
Prebuilt Layout	—	—	—	—	Available
Prebuilt Read	—	—	—	—	Available
