GLM-4.5 Flash is a language model from Zhipu AI (Z AI) with a 128K-token context window and up to 32K output tokens. It is a fast, lightweight variant of the GLM-4.5 series, optimized for low-latency agentic and tool-use applications.
Specifications
Canonical ID: zhipu-glm-4-5-flash
Type: Language
Status: Active
Creator: Zhipu AI
Context Window: 128K tokens
Max Output: 32K tokens
Input Modalities: Text
Output Modalities: Text
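
The two limits above interact when sizing a request. The following is a minimal sketch, not an official client: it assumes "K" means 1,000 tokens, that prompt and output tokens share the same context window, and that token counts are obtained elsewhere. The function name `clamp_output_budget` is hypothetical.

```python
# Limits taken from the specification list above.
# Assumption: "K" means 1,000 tokens (it could also mean 1,024).
CONTEXT_WINDOW_TOKENS = 128_000
MAX_OUTPUT_TOKENS = 32_000


def clamp_output_budget(prompt_tokens: int, requested_output_tokens: int) -> int:
    """Return the largest output budget that respects both limits.

    Assumption: prompt and output tokens count against one shared
    context window. Raises ValueError if the prompt leaves no room.
    """
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt leaves no room for output in the context window")
    # The budget is capped by the request, the output limit, and the space left.
    return min(requested_output_tokens, MAX_OUTPUT_TOKENS, remaining)
```

For example, under these assumptions a 120,000-token prompt leaves room for at most 8,000 output tokens, even though the model's output limit is 32K.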

Capabilities

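Input (1 of 5 supported)
Text: supported
Image: not supported
Audio: not supported
Video: not supported
PDF: not supported
Output (1 of 5 supported)
Text: supported
Image: not supported
Audio: not supported
Video: not supported
Embedding: not supported
Capabilities (1 of 13 supported)
Reasoning: not supported
Adaptive Reasoning: not supported
Function Calling: supported
Parallel Function Calling: not supported
Structured Outputs: not supported
Native JSON Schema: not supported
Web Search: not supported
URL Context: not supported
Computer Use: not supported
Code Execution: not supported
File Search: not supported
Prompt Caching: not supported
Assistant Prefill: not supported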
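
A capability list like this can be checked before a request is sent. The sketch below is illustrative only: it reads the list above as marking Function Calling as the single supported capability out of 13, and the key names and the `unsupported_features` helper are hypothetical, not part of any real API.

```python
# Flags transcribed from the capability list above (True = listed as supported).
# Key names are hypothetical identifiers chosen for this sketch.
CAPABILITIES = {
    "reasoning": False,
    "adaptive_reasoning": False,
    "function_calling": True,
    "parallel_function_calling": False,
    "structured_outputs": False,
    "native_json_schema": False,
    "web_search": False,
    "url_context": False,
    "computer_use": False,
    "code_execution": False,
    "file_search": False,
    "prompt_caching": False,
    "assistant_prefill": False,
}


def unsupported_features(requested: list[str]) -> list[str]:
    """Return the requested features the model is not listed as supporting.

    Unknown feature names are treated as unsupported.
    """
    return sorted(name for name in requested if not CAPABILITIES.get(name, False))
```

A caller could use the returned list to fall back to another model or to drop the unsupported options from the request.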

Versions

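| Version | Released | Context | Input / 1M tokens | Output / 1M tokens | Status |
| --- | --- | --- | --- | --- | --- |
| GLM-4.6V Flash | | 128K | | | Available |
| GLM-4.5 Flash | | 128K | | | Current |
| GLM-4.7 Flash Non-Reasoning | | | | | Available |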

Other models

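| Model | Tier | Released | Context | Input / 1M tokens | Output / 1M tokens |
| --- | --- | --- | --- | --- | --- |
| GLM-5V Turbo | | | 203K | $1.20 | $4.00 |
| GLM-5 Turbo | | | 203K | $1.20 | $4.00 |
| GLM-5.1 Non-Reasoning | | | | | |
| GLM-5 Non-Reasoning | | | | | |
| GLM-5 Code | | | 200K | $1.20 | $5.00 |
| GLM-4 32B | | | 128K | $0.100 | $0.100 |
| GLM-4.7 FlashX | | | 200K | $0.060 | $0.400 |
| GLM-4.7 Non-Reasoning | | | | | |
| GLM-4.6 Reasoning | | | | | |
| GLM-4.6V Reasoning | | | | | |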

Model IDs