QwQ 32B

DeepinfraText

QwQ 32B is a text model from Deepinfra with a context window of 131K tokens and max output of 131K tokens. Pricing starts at $0.15 per million input tokens and $0.40 per million output tokens (cheapest at Deepinfra).

Specifications

Model Keydeepinfra/Qwen/QwQ-32B
ProviderDeepinfra
LiteLLM Providerdeepinfra
ModeText
Canonical Nameqwq-32b
Context Window131K tokens
Max Output131K tokens

Capabilities

Vision Function Calling Reasoning JSON Schema System Messages Web Search Prompt Caching Audio Input Audio Output

Pricing

TypePer 1K TokensPer 1M Tokens
Input Tokens$0.000150$0.150
Output Tokens$0.000400$0.400

Price Comparison by Provider

Compare prices for QwQ 32B across different providers. The same model may be available through multiple providers at different price points.

Provider
Model Key
Input Price
Output Price
Fireworks AIfireworks_ai/accounts/fireworks/models/qwq-32b$0.900$0.900
Deepinfradeepinfra/Qwen/QwQ-32B$0.150$0.400
Hyperbolichyperbolic/Qwen/QwQ-32B$0.200$0.200
Nscalenscale/Qwen/QwQ-32B$0.180$0.200
SambaNovasambanova/QwQ-32B$0.500$1.00

Similar Models

Models with similar capabilities and context window size.

Model
Provider
Mode
Input Price
Output Price
Context
Max Output
Vision
Functions
Gemma 3 27B ItGoogle GeminiTextN/AN/A131K8Kyesyes
GPT-oss-120b-mxfp-GGUFLemonadeTextN/AN/A131K33Knoyes
GPT-oss-20bOpenRouterText$0.020$0.100131K33Knoyes
GPT-oss-20b-mxfp4-GGUFLemonadeTextN/AN/A131K33Knoyes
GPT-oss:120b-cloudOllamaTextN/AN/A131K131Knoyes
GPT-oss:20b-cloudOllamaTextN/AN/A131K131Knoyes
Llama 3.2 3B InstructDeepinfraText$0.020$0.020131K131Knono
Llama3.2 11B Vision InstructLambda AiText$0.015$0.025131K131Kyesyes
Llama3.2 3B InstructLambda AiText$0.015$0.025131K131Knoyes
Mistral Nemo Instruct 2407DeepinfraText$0.020$0.040131K131Knono