GLM-4.5 Air FP8 is Zhipu AI's language model with a 128K context window, starting at $0.2 / 1M input and $1.10 / 1M output. An FP8-quantized version of the GLM-4.5 Air MoE model, optimized for memory-efficient deployment while preserving agentic reasoning capabilities.
Specifications
Canonical IDzhipu-glm-4-5-air-fp8
TypeLanguage
StatusActive
CreatorZhipu AIZhipu AI
Providers
Context Window128K tokens
Input ModalitiesText
Output ModalitiesText

Capabilities

Input1/5
Text
Image·
Audio·
Video·
PDF·
Output1/5
Text
Image·
Audio·
Video·
Embedding·
Capabilities3/13
Reasoning·
Adaptive Reasoning·
Function Calling
Parallel Function Calling
Structured Outputs
Native JSON Schema·
Web Search·
URL Context·
Computer Use·
Code Execution·
File Search·
Prompt Caching·
Assistant Prefill·

Pricing by Provider

US Dollar ($)
Per 1M tokens
ProviderStandard
Input
$ / 1M
Output
$ / 1M
Together AI logo
Together AI
together_ai/zai-org/GLM-4.5-Air-FP8
$0.2$1.10

Cost Calculator

US Dollar ($)
Preset:

Other Models

ModelTierReleasedContextInput / 1MOutput / 1M
GLM-5.21.0M$1.20$4.10
GLM-5V Turbo200K$1.20$4.00
GLM-5 Turbo262K$1.20$4.00
GLM-5.1 Non-Reasoning
GLM-5 Non-Reasoning
GLM-5 Code200K$1.20$5.00
GLM-4.6V FlashFlash128K
GLM-4 32B128K$0.100$0.100
GLM-4.7 FlashX200K$0.060$0.400
GLM-4.7 Non-Reasoning

Model IDs

together_ai/zai-org/GLM-4.5-Air-FP8
zhipu-glm-4-5-air-fp8