Qwen3 4B Instruct 2507 GGUF

Qwen3 4B Instruct 2507 GGUF is a text model from Lemonade (AMD) with a context window of 262K tokens and a maximum output of 33K tokens.

Capabilities

Vision, Function Calling, Reasoning, JSON Schema, System Messages, Web Search, Prompt Caching, Audio Input, Audio Output

Specifications

Model Key: lemonade/Qwen3-4B-Instruct-2507-GGUF
Provider: Lemonade (AMD)
Provider ID: lemonade
Mode: Text
Canonical Name: qwen-3-4b-2507
Context Window: 262K tokens
Max Output: 33K tokens
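
The two limits above interact when sizing a request. The following is a minimal sketch, not provider code: it assumes the context window is shared between prompt and output tokens, and it uses the rounded figures from this page (262K and 33K) because the exact token counts are not stated. The function name `clamp_output_budget` is hypothetical.

```python
# Rounded limits as listed on this page; the exact token counts are an assumption.
CONTEXT_WINDOW = 262_000
MAX_OUTPUT = 33_000


def clamp_output_budget(prompt_tokens: int, requested_output: int) -> int:
    """Return the largest output budget that respects both limits.

    Assumes prompt and output tokens share the context window.
    Returns 0 when the prompt alone already fills the window.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    remaining = max(CONTEXT_WINDOW - prompt_tokens, 0)
    return min(requested_output, MAX_OUTPUT, remaining)


# A short prompt is capped by the max-output limit.
print(clamp_output_budget(1_000, 50_000))    # 33000
# A very long prompt is capped by what is left of the context window.
print(clamp_output_budget(250_000, 20_000))  # 12000
```
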

Pricing

Type | Per 1K Tokens | Per 1M Tokens
Input Tokens | N/A | N/A
Output Tokens | N/A | N/A

Benchmarks

Benchmark | Score | Rank
Intelligence Index | 12.9 | #154
Coding Index | 9.1 | #139
Math Index | 52.3 | #53
MMLU-Pro | 0.7 | #127
GPQA | 0.5 | #125
HLE | 0.0 | #121
LiveCodeBench | 0.4 | #84
IFBench | 0.3 | #117
Time to First Token | 0.00s | #1
SciCode | 0.2 | #168
AIME 2025 | 0.5 | #53
LCR | 0.1 | #134
TerminalBench Hard | 0.0 | #107
TAU2 | 0.3 | #94

Price Comparison by Provider

Compare prices for Qwen3 4B Instruct 2507 GGUF across different providers. The same model may be available through multiple providers at different price points.

Provider | Model Key | Input Price, $ | Output Price, $
Lemonade (AMD) | lemonade/Qwen3-4B-Instruct-2507-GGUF | N/A | N/A
Fireworks AI | fireworks_ai/accounts/fireworks/models/qwen3-4b-instruct-2507 | 0.200 | 0.200
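
To turn the listed Fireworks AI prices into a per-request figure, a rough estimate can be computed as below. This is only a sketch: the table does not state the pricing unit, so the code assumes the 0.200 values are dollars per 1M tokens, and `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# Fireworks AI prices as listed in the table above.
# ASSUMPTION: the unit is dollars per 1M tokens; the page does not say.
INPUT_PRICE_PER_1M = Decimal("0.200")
OUTPUT_PRICE_PER_1M = Decimal("0.200")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated request cost in dollars under the assumed per-1M pricing."""
    total = (Decimal(input_tokens) * INPUT_PRICE_PER_1M
             + Decimal(output_tokens) * OUTPUT_PRICE_PER_1M)
    return total / TOKENS_PER_UNIT


# 100,000 input tokens plus 5,000 output tokens.
print(estimate_cost(100_000, 5_000))
```

`Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding noise when summed across many requests.
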

All Variants

All available versions, regions, and API endpoints for Qwen3 4B Instruct 2507 GGUF.

Model Key | Provider | Mode | Input Price, $ | Output Price, $ | Context | Max Output | Vision | Functions
fireworks_ai/accounts/fireworks/models/qwen3-4b-instruct-2507 | Fireworks AI | Text | 0.200 | 0.200 | 262K | 262K | no | no
lemonade/Qwen3-4B-Instruct-2507-GGUF | Lemonade (AMD) | Text | N/A | N/A | 262K | 33K | no | yes
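
When choosing between variants programmatically, the rows above can be held as plain records and filtered by capability. The sketch below is illustrative only: the `Variant` class and `with_function_calling` helper are hypothetical names, the values are copied from the table, and N/A prices are represented as `None`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Variant:
    model_key: str
    provider: str
    input_price: Optional[float]   # None where the page lists N/A
    output_price: Optional[float]  # None where the page lists N/A
    context: str
    max_output: str
    vision: bool
    functions: bool


# Rows transcribed from the variants table on this page.
VARIANTS = [
    Variant("fireworks_ai/accounts/fireworks/models/qwen3-4b-instruct-2507",
            "Fireworks AI", 0.200, 0.200, "262K", "262K", False, False),
    Variant("lemonade/Qwen3-4B-Instruct-2507-GGUF",
            "Lemonade (AMD)", None, None, "262K", "33K", False, True),
]


def with_function_calling(variants: List[Variant]) -> List[str]:
    """Return the model keys of variants listed as supporting functions."""
    return [v.model_key for v in variants if v.functions]


print(with_function_calling(VARIANTS))
```
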