GLM-OCR is Zhipu AI's image-to-text model: a lightweight 0.9B-parameter OCR model from Z.AI that achieves top-tier document recognition performance on OmniDocBench and is optimized for real-world business document processing.
Specifications
Canonical ID: zhipu-glm-ocr
Type: Image to Text
Status: Active
Creator: Zhipu AI
Input Modalities: Text
Output Modalities: Text
Parameters: 1.32B
HuggingFace Likes: 1,806
HuggingFace Downloads (30d): 4,468,881
HuggingFace Downloads (all-time): 20,912,030

Capabilities

Input (1/5)
Marked as supported: Text
Not marked as supported: Image, Audio, Video, PDF

Output (1/5)
Marked as supported: Text
Not marked as supported: Image, Audio, Video, Embedding

Capabilities (0/13)
Marked as supported: none
Not marked as supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Versions

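| Version | Released | Context | Input / 1M tokens | Output / 1M tokens | Status |
| --- | --- | --- | --- | --- | --- |
| GLM-5V Turbo | - | 200K | $1.20 | $4.00 | Available |
| GLM-5 Turbo | - | 262K | $1.20 | $4.00 | Available |
| GLM-5.1 Non-Reasoning | - | - | - | - | Available |
| GLM-5 Non-Reasoning | - | - | - | - | Available |
| GLM-5 Code | - | 200K | $1.20 | $5.00 | Available |
| GLM-4.6V Flash | - | 128K | - | - | Available |
| GLM-4 32B | - | 128K | $0.100 | $0.100 | Available |
| GLM-4.7 FlashX | - | 200K | $0.060 | $0.400 | Available |
| GLM-4.7 Non-Reasoning | - | - | - | - | Available |
| GLM-4.6 Reasoning | - | - | - | - | Available |
| GLM-OCR | - | - | - | - | Current |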
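
The per-1M-token prices in the Versions listing can be turned into a per-request estimate with simple arithmetic: cost = (input tokens × input price + output tokens × output price) / 1,000,000. The sketch below is illustrative only. The helper name `estimate_cost` and the token counts in the example are assumptions, and the prices are copied from the listing above, assumed here to be in USD.

```python
from decimal import Decimal

# (input price, output price) in USD per 1M tokens, copied from the Versions listing.
# Versions with no listed price (including GLM-OCR itself) are left out.
PRICES_PER_1M = {
    "GLM-5V Turbo": (Decimal("1.20"), Decimal("4.00")),
    "GLM-5 Turbo": (Decimal("1.20"), Decimal("4.00")),
    "GLM-5 Code": (Decimal("1.20"), Decimal("5.00")),
    "GLM-4 32B": (Decimal("0.100"), Decimal("0.100")),
    "GLM-4.7 FlashX": (Decimal("0.060"), Decimal("0.400")),
}


def estimate_cost(version, input_tokens, output_tokens):
    """Estimate the USD cost of one request (hypothetical helper, not an official API)."""
    input_price, output_price = PRICES_PER_1M[version]
    million = Decimal(1_000_000)
    return (input_price * input_tokens + output_price * output_tokens) / million


if __name__ == "__main__":
    # Example token counts are made up for illustration.
    cost = estimate_cost("GLM-4.7 FlashX", 50_000, 2_000)
    print(f"${cost:.4f}")  # prints "$0.0038"
```

`Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding error. Looking up a version with no listed price raises `KeyError`, which makes the missing data explicit rather than silently treating it as free.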

Model IDs

zai-org/glm-ocr
zhipu-glm-ocr