GLM-5.1 NVFP4 MTP is a Zhipu AI language model with a 203K-token context window, priced from $1.40 per 1M input tokens and $4.40 per 1M output tokens. It is a GLM-5.1 variant quantized to NVIDIA FP4 (NVFP4) precision with multi-token prediction (MTP), a combination intended to improve GPU inference throughput.
Specifications
Canonical ID: zhipu-glm-5-1-nvfp4-mtp
Type: Language
Status: Active
Creator: Zhipu AI
Providers: Tensormesh
Context Window: 203K tokens
Input Modalities: Text
Output Modalities: Text
Reasoning Efforts: default

Capabilities

Input (1 of 5 supported)
Supported: Text
Not supported: Image, Audio, Video, PDF

Output (1 of 5 supported)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (4 of 13 supported)
Supported: Reasoning, Function Calling, Structured Outputs, Prompt Caching
Not supported: Adaptive Reasoning, Parallel Function Calling, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Assistant Prefill

Pricing by Provider

Standard pricing in US dollars ($) per 1M tokens.

| Provider | Model ID | Input ($ / 1M) | Output ($ / 1M) |
|---|---|---|---|
| Tensormesh | tensormesh/lukealonso/GLM-5.1-NVFP4-MTP | $1.40 | $4.40 |
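
The per-token prices above translate into a request cost by scaling each token count to millions and multiplying by the matching rate. The following is a minimal sketch of that arithmetic, not an official calculator: the function name `estimate_cost` and the example token counts are hypothetical, and it assumes billing is strictly linear in token count with no caching discounts or minimum charges.

```python
from decimal import Decimal

# Prices from the Tensormesh listing, in US dollars per 1M tokens.
INPUT_PRICE_PER_M = Decimal("1.40")
OUTPUT_PRICE_PER_M = Decimal("4.40")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the US dollar cost of one request (hypothetical helper).

    Assumes linear per-token billing with no discounts or minimums.
    """
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 150,000 input tokens and 20,000 output tokens.
    print(f"${estimate_cost(150_000, 20_000):.4f}")
```

For that workload the input side contributes $0.21 and the output side $0.088, about $0.30 in total. `Decimal` is used instead of floats so that small per-request amounts do not pick up binary rounding error when summed over many requests.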

Versions

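| Version | Released | Context | Input / 1M | Output / 1M | Status |
|---|---|---|---|---|---|
| GLM-5.2 | | 1.0M | $0.980 | $3.08 | Available |
| GLM-5V Turbo | | 203K | $1.20 | $4.00 | Deprecating |
| GLM-5 Turbo | | 262K | $1.20 | $4.00 | Available |
| GLM-5.1 NVFP4 MTP | | 203K | $1.40 | $4.40 | Current |
| GLM-5.1 Non-Reasoning | | | | | Available |
| GLM-5 Non-Reasoning | | | | | Available |
| GLM-5 Code | | 200K | $1.20 | $5.00 | Available |
| GLM-5.1 Fast | | 203K | $2.80 | $8.80 | Available |
| GLM-4.6V Flash | | 128K | | | Available |
| GLM-4 32B | | 128K | $0.100 | $0.100 | Available |
| GLM-4.7 FlashX | | 200K | $0.060 | $0.400 | Available |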
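
Because input and output rates differ between versions, which one is cheapest depends on the mix of tokens in a workload. The sketch below ranks the versions that list both prices for a given token mix. The names `PRICES`, `workload_cost`, and `rank_by_cost` are hypothetical, the workload is illustrative, and the comparison covers price only, not quality, context length, or availability.

```python
from decimal import Decimal

# (input $/1M, output $/1M), copied from the versions listing.
# Versions without listed prices are omitted.
PRICES = {
    "GLM-5.2": ("0.980", "3.08"),
    "GLM-5V Turbo": ("1.20", "4.00"),
    "GLM-5 Turbo": ("1.20", "4.00"),
    "GLM-5.1 NVFP4 MTP": ("1.40", "4.40"),
    "GLM-5 Code": ("1.20", "5.00"),
    "GLM-5.1 Fast": ("2.80", "8.80"),
    "GLM-4 32B": ("0.100", "0.100"),
    "GLM-4.7 FlashX": ("0.060", "0.400"),
}


def workload_cost(version: str, input_tokens: int, output_tokens: int) -> Decimal:
    """US dollar cost of a workload on one version, assuming linear billing."""
    input_price, output_price = (Decimal(p) for p in PRICES[version])
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / Decimal(1_000_000)


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[tuple[str, Decimal]]:
    """Return (version, cost) pairs sorted from cheapest to most expensive."""
    costs = [(v, workload_cost(v, input_tokens, output_tokens)) for v in PRICES]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens and 200K output tokens.
    for version, cost in rank_by_cost(1_000_000, 200_000):
        print(f"{version:20s} ${cost:.2f}")
```

GLM-5.1 Fast lists exactly twice the input and output rates of GLM-5.1 NVFP4 MTP, so it costs twice as much for any token mix.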

Model IDs

tensormesh/lukealonso/GLM-5.1-NVFP4-MTP
zhipu-glm-5-1-nvfp4-mtp