DeepSeek-OCR


DeepSeek-OCR is DeepSeek's image-to-text model with an 8K context window and up to 8K output tokens, available from 4 providers, starting at $0.030 / 1M input tokens and $0.030 / 1M output tokens. It is a multimodal OCR model that compresses long document contexts via optical 2D mapping, combining a DeepEncoder with a compact MoE language model.

Spec
Canonical ID: deepseek-ocr
Type: Image to Text
Status: Active
Creator: DeepSeek
Providers: 4
Context Window: 8K tokens
Max Output: 8K tokens
Input Modalities: Image, PDF, Text
Output Modalities: Text
Parameters: 3.34B
HuggingFace Likes: 3,218
HuggingFace Downloads (30d): 2,082,348
HuggingFace Downloads (all-time): 22,155,059

Capabilities

Input (3/5)
Supported: Text, Image, PDF
Not supported: Audio, Video

Output (1/5)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (1/13)
Supported: Structured Outputs
Not supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

Standard tier; prices in $ per 1M tokens.

Provider            Model ID                                   Input     Output
Hugging Face        novita:deepseek/deepseek-ocr               $0.030    $0.030
Novita              novita/deepseek/deepseek-ocr               $0.030    $0.030
Google Gemini       deepseek-ocr-maas                          $0.300    $1.20
Google Vertex AI    vertex_ai/deepseek-ai/deepseek-ocr-maas    $0.300    $1.20
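
Because the listed prices differ by provider, and the two Google entries price output tokens higher than input tokens, a small calculation makes the gap concrete. The sketch below is illustrative only: the prices are copied from the listing above, while the function names `request_cost` and `compare` and the workload figures in the usage note are assumptions made for this example, not part of any provider's API.

```python
# Standard-tier prices in USD per 1M tokens, copied from the provider listing:
# provider name -> (input price, output price).
PRICES = {
    "Hugging Face": (0.030, 0.030),
    "Novita": (0.030, 0.030),
    "Google Gemini": (0.300, 1.20),
    "Google Vertex AI": (0.300, 1.20),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload for one provider (hypothetical helper)."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def compare(input_tokens: int, output_tokens: int) -> list[tuple[float, str]]:
    """Return (cost, provider) pairs sorted from cheapest to most expensive."""
    return sorted(
        (request_cost(name, input_tokens, output_tokens), name) for name in PRICES
    )


if __name__ == "__main__":
    # Hypothetical workload: 1,000 pages at 1,000 input and 500 output tokens each.
    for cost, name in compare(1_000 * 1_000, 1_000 * 500):
        print(f"{name}: ${cost:.3f}")
```

For the hypothetical workload in the `__main__` block (1M input tokens and 0.5M output tokens in total), the listed rates work out to about $0.045 at the $0.030 / $0.030 rate and about $0.90 at the $0.300 / $1.20 rate.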

Versions

Version              Released    Context    Input / 1M    Output / 1M    Status
DeepSeek-OCR 2       -           -          -             -              Available
DeepSeek-OCR         -           8K         $0.030        $0.030         Current
DeepSeek-OCR         -           -          -             -              Available
Document OCR         -           -          -             -              Available
Mistral OCR          -           -          $2.00         $3.00          Available
OCR                  -           -          -             -              Available
Prebuilt Document    -           -          $0.000        $0.000         Available
Prebuilt Layout      -           -          $0.000        $0.000         Available
Prebuilt Read        -           -          $0.000        $0.000         Available

Model IDs