Name: Qwen2.5 72B Instruct
Brand: Alibaba

Qwen2.5 72B Instruct is Alibaba's language model with a 131K context window and up to 16K output tokens, available from 7 providers, starting at $0.12 / 1M input and $0.3 / 1M output. A 72-billion-parameter instruction-tuned LLM from Alibaba's Qwen2.5 series, excelling at natural language understanding, summarization, and dialogue.

Specifications
Canonical ID	`alibaba-qwen2-5-72b-instruct`
Type	Language
Status	Active
Creator	Alibaba
Providers	DeepInfra Fireworks AI Hugging Face Hyperbolic Nebius Novita OpenRouter
Context Window	131K tokens
Max Output	16K tokens
Input Modalities	Text
Output Modalities	Text
Parameters	72B
HuggingFace Likes	927
HuggingFace Downloads (30d)	457,915
HuggingFace Downloads (all-time)	5,817,981
Release Date	2024-09-19 · 2 years ago
Knowledge Cutoff	2024-06-30 · 2 years ago

Benchmarks
Intelligence Index	9.6 #301
Math Index	14.0 #213
MMLU-Pro	0.7 #190
GPQA	0.5 #341
HLE	0.0 #385
LiveCodeBench	0.3 #223
AIME	0.2 #98
IFBench	0.4 #272
Time to First Token	0.00s #9
SciCode	0.3 #294
MATH-500	0.9 #81
AIME 2025	0.1 #213
LCR	0.2 #265
TerminalBench Hard	0.0 #259
TAU2	0.3 #215
Output TPS	0.0 #265

Capabilities

Input1/5

Text✓

Image·

Audio·

Video·

PDF·

Output1/5

Text✓

Image·

Audio·

Video·

Embedding·

Capabilities4/13

Reasoning·

Adaptive Reasoning·

Function Calling✓

Parallel Function Calling✓

Structured Outputs✓

Native JSON Schema✓

Web Search·

URL Context·

Computer Use·

Code Execution·

File Search·

Prompt Caching·

Assistant Prefill·

Pricing by Provider

US Dollar ($)

Per 1M tokens

Provider	Standard
Provider	Input $ / 1M	Output $ / 1M
DeepInfra `deepinfra/Qwen/Qwen2.5-72B-Instruct`	$0.12	$0.39
Fireworks AI `fireworks_ai/accounts/fireworks/models/qwen2p5-72b-instruct`	$0.9	$0.9
Hugging Face `novita:qwen/qwen-2.5-72b-instruct`	$0.38	$0.4
Hyperbolic `hyperbolic/Qwen/Qwen2.5-72B-Instruct`	$0.12	$0.3
Nebius `nebius/Qwen/Qwen2.5-72B-Instruct`	$0.13	$0.4
Novita `novita/qwen/qwen-2.5-72b-instruct`	$0.38	$0.4
OpenRouter `qwen/qwen-2.5-72b-instruct`	$0.36	$0.4

Cost Calculator

US Dollar ($)

Preset:

Input tokens

Output tokens

Number of calls

Cheapest Instances to Run It

Cloud GPU instances that can host Qwen2.5 72B Instruct, ranked by cheapest on-demand price. The model needs about 173 GB of GPU memory at FP16 precision (estimated from its parameter count), so treat the fit as guidance rather than a guarantee.

All clouds

FP16 (full precision)

US Dollar ($)

Instance	Cloud	GPU	VRAM	Price	Cheapest region
Standard_NP40s	Azure	4× AMD Alveo U250 FPGA (64GB)	256 GB	$6.60/hr	westus2
g2-standard-96	GCP	8× nvidia-l4	192 GB	$7.98/hr	us-east4
g7e.12xlarge	AWS	2× RTX PRO Server 6000	192 GB	$8.29/hr	us-east-1
7 more instances can run Qwen2.5 72B Instruct Unlock the full ranked list and FP8 / INT4 quantization with a CloudPrice subscription.

Versions

Version	Released	Context	Input / 1M	Output / 1M	Status
Dolphin 2.9.2 Qwen2 72B	—	131K	$0.900	$0.900	Available
Cogito V1 Preview Qwen 14B	—	131K	$0.200	$0.200	Available
Cogito V1 Preview Qwen 32B	—	131K	$0.900	$0.900	Available
QwQ 32B	2025-03-05	131K	$0.150	$0.200	Deprecated
Qwen2.5 Coder 32B Instruct	2024-11-11	131K	$0.050	$0.100	Available
Qwen2.5 7B Instruct	2024-10-16	131K	$0.040	$0.070	Available
Qwen2.5 72B Instruct	2024-09-19	131K	$0.120	$0.300	Current
QwQ 32B Preview	—	33K	$0.900	$0.900	Available
Qwen2.5 32B Instruct	—	128K	$0.060	$0.200	Available
Qwen2 72B Instruct	—	33K	$0.900	$0.900	Available
Qwen2.5 Coder 7B Instruct	—	33K	$0.010	$0.030	Available

Model IDs

accounts/fireworks/models/qwen2p5-72b-instruct

alibaba-qwen2-5-72b-instruct

deepinfra/Qwen/Qwen2.5-72B-Instruct

fireworks_ai/accounts/fireworks/models/qwen2p5-72b-instruct

huggingface-llm-qwen2-5-72b-instruct

hyperbolic/Qwen/Qwen2.5-72B-Instruct

nebius/Qwen/Qwen2.5-72B-Instruct

novita/qwen/qwen-2.5-72b-instruct

qwen/qwen-2.5-72b-instruct

Qwen/Qwen2.5-72B-Instruct

qwen2-5-72b-instruct

qwen2.5-72b-instruct

Qwen2.5 72B Instruct

CapabilitiesAPIGET/api/v1/models/alibaba-qwen2-5-72b-instruct

Pricing by ProviderAPIGET/api/v1/models/alibaba-qwen2-5-72b-instruct/pricing

Cost CalculatorAPIGET/api/v1/models/alibaba-qwen2-5-72b-instruct/pricing/calculate?input_tokens=1000000&output_tokens=500000

Cheapest Instances to Run ItAPIGET/api/v1/models/alibaba-qwen2-5-72b-instruct/instances

VersionsAPIGET/api/v1/models?family=qwen2

Model IDsAPIGET/api/v1/models/alibaba-qwen2-5-72b-instruct

Capabilities

Pricing by Provider

Cost Calculator

Cheapest Instances to Run It

Versions

Model IDs