GLM-4.5 Flash is Zhipu AI's language model with a 128K context window and up to 32K output tokens. A fast, lightweight variant of the GLM-4.5 series from Z AI, optimized for low-latency agentic and tool-use applications.
Specifications
Canonical IDzhipu-glm-4-5-flash
TypeLanguage
StatusActive
CreatorZhipu AIZhipu AI
Context Window128K tokens
Max Output32K tokens
Input ModalitiesText
Output ModalitiesText

Capabilities

Input1/5
Text
Image·
Audio·
Video·
PDF·
Output1/5
Text
Image·
Audio·
Video·
Embedding·
Capabilities1/13
Reasoning·
Adaptive Reasoning·
Function Calling
Parallel Function Calling·
Structured Outputs·
Native JSON Schema·
Web Search·
URL Context·
Computer Use·
Code Execution·
File Search·
Prompt Caching·
Assistant Prefill·

Versions

VersionReleasedContextInput / 1MOutput / 1MStatus
GLM-4.6V Flash128KAvailable
GLM-4.5 Flash128KCurrent
GLM-4.7 Flash Non-ReasoningAvailable

Other Models

ModelTierReleasedContextInput / 1MOutput / 1M
GLM-5.21.0M$0.930$3.00
GLM-5.2 Fast1.0M$3.00$10.25
GLM-5V Turbo203K$1.20$4.00
GLM-5 Turbo262K$1.20$4.00
GLM-5.1 Non-Reasoning
GLM-5 Non-Reasoning
GLM-5 Code200K$1.20$5.00
GLM-5.1 Fast203K$2.80$8.80
GLM-5.1 NVFP4 MTP203K$1.40$4.40
GLM-4.7 FlashX200K$0.060$0.400

Model IDs

zai/glm-4.5-flash
zhipu-glm-4-5-flash