Qwen3 4B Instruct 2507 GGUF Pricing & Specs | AI Models

Qwen3 4B Instruct 2507 GGUF is a text model from Lemonade (AMD) with a context window of 262K tokens and a maximum output of 33K tokens.

Capabilities

✗ Vision
✓ Function Calling
✗ Reasoning
✓ JSON Schema
✗ System Messages
✗ Web Search
✗ Prompt Caching
✗ Audio Input
✗ Audio Output

Specifications

Model Key	`lemonade/Qwen3-4B-Instruct-2507-GGUF`
Provider	Lemonade (AMD)
Provider ID	lemonade
Mode	Text
Canonical Name	qwen-3-4b-2507
Context Window	262K tokens
Max Output	33K tokens
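
The two limits above can be combined into a simple pre-flight check before sending a request. The following is a minimal sketch, not part of any provider's API: the function name `fits_limits` is hypothetical, the limits are the rounded figures from the table (262K and 33K), and it assumes that generated tokens count against the context window along with the prompt.

```python
# Rounded limits taken from the Specifications table above.
CONTEXT_WINDOW = 262_000  # "262K tokens" (rounded figure)
MAX_OUTPUT = 33_000       # "33K tokens" (rounded figure)


def fits_limits(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if a request stays within both listed limits.

    Assumption: prompt and output tokens share the same context window.
    """
    if requested_output_tokens > MAX_OUTPUT:
        return False
    return prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW


if __name__ == "__main__":
    print(fits_limits(200_000, 33_000))  # prompt plus output under the window
    print(fits_limits(240_000, 33_000))  # prompt plus output over the window
```

Exact token limits may differ slightly from the rounded values, so leaving some headroom is a reasonable precaution.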

Pricing

Type	Per 1K Tokens	Per 1M Tokens
Input Tokens	N/A	N/A
Output Tokens	N/A	N/A

Benchmarks

Benchmark	Score	Rank
Intelligence Index	12.9	#154
Coding Index	9.1	#139
Math Index	52.3	#53
MMLU-Pro	0.7	#127
GPQA	0.5	#125
HLE	0.0	#121
LiveCodeBench	0.4	#84
IFBench	0.3	#117
Time to First Token	0.00s	#1
SciCode	0.2	#168
AIME 2025	0.5	#53
LCR	0.1	#134
TerminalBench Hard	0.0	#107
TAU2	0.3	#94

Price Comparison by Provider

Compare prices for Qwen3 4B Instruct 2507 GGUF across different providers. The same model may be available through multiple providers at different price points.

Provider	Model Key	Input Price, $	Output Price, $
Lemonade (AMD)	lemonade/Qwen3-4B-Instruct-2507-GGUF	N/A	N/A
Fireworks AI	fireworks_ai/accounts/fireworks/models/qwen3-4b-instruct-2507	0.200	0.200
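
To compare providers for a concrete workload, the listed prices can be turned into a per-request estimate. The sketch below is illustrative only: the function name `estimate_cost` is hypothetical, and it assumes the price columns are in USD per 1M tokens, which the table does not state explicitly. Providers listed as N/A yield no estimate.

```python
from typing import Optional

# (input price, output price) copied from the table above.
# Assumption: prices are USD per 1M tokens; None means the table lists N/A.
PRICES_PER_1M = {
    "lemonade/Qwen3-4B-Instruct-2507-GGUF": None,
    "fireworks_ai/accounts/fireworks/models/qwen3-4b-instruct-2507": (0.200, 0.200),
}


def estimate_cost(model_key: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Estimate the USD cost of one request, or None if no price is listed."""
    prices = PRICES_PER_1M[model_key]
    if prices is None:
        return None
    input_price, output_price = prices
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    for key in PRICES_PER_1M:
        print(key, estimate_cost(key, input_tokens=50_000, output_tokens=2_000))
```

If the unit turns out to be different, only the divisor in `estimate_cost` needs to change.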

All Variants

All available versions, regions, and API endpoints for Qwen3 4B Instruct 2507 GGUF.

Model Key	Provider	Mode	Input Price, $	Output Price, $	Context	Max Output	Vision	Functions
fireworks_ai/accounts/fireworks/models/qwen3-4b-instruct-2507	Fireworks AI	Text	0.200	0.200	262K	262K	no	no
lemonade/Qwen3-4B-Instruct-2507-GGUF	Lemonade (AMD)	Text	N/A	N/A	262K	33K	no	yes
