Question 1

What is the cheapest AI model for chat?

Accepted Answer

The cheapest AI chat models are typically smaller models like GPT-3.5-turbo, Claude Instant, or open-source models hosted on platforms like Together AI and Groq. Prices can be as low as $0.10-0.25 per million tokens. Cheaper models are usually less capable than flagship models such as GPT-4 or Claude 3 Opus. Use CloudPrice to compare current prices across all providers.

Question 2

How much does GPT-4 cost?

Accepted Answer

GPT-4 pricing varies by version. GPT-4 Turbo costs approximately $10 per million input tokens and $30 per million output tokens. GPT-4o is more affordable at around $5 per million input tokens and $15 per million output tokens. OpenAI also offers GPT-4o-mini at significantly lower prices. Pricing may change, so check CloudPrice for current rates.
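As a rough illustration of how these per-million-token rates turn into a per-request cost, here is a minimal Python sketch. The prices are the approximate figures quoted above, and the workload (2,000 input tokens and 500 output tokens per request) is a hypothetical example, not a measurement.

```python
# Approximate per-million-token prices quoted above, in USD: (input, output).
# These are illustrative; check current rates before relying on them.
PRICES = {
    "gpt-4-turbo": (10.00, 30.00),
    "gpt-4o": (5.00, 15.00),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one request given its token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2,000 input tokens and 500 output tokens per request.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 2_000, 500):.4f} per request")
```

Multiplying the per-request figure by the expected number of requests gives a first estimate of a monthly bill.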

Question 3

What is the difference between input and output tokens?

Accepted Answer

Input tokens are the text you send to the AI model (your prompts, context, and conversation history). Output tokens are what the model generates in response. Most AI providers charge differently for input and output tokens, with output tokens typically being 2-4x more expensive because the model generates them one at a time, while input tokens can be processed in parallel.
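Because conversation history counts as input, input tokens can add up quickly in a multi-turn chat. The sketch below assumes the client resends the full history with every request and that no caching or truncation applies; the function name and the token counts are hypothetical.

```python
def conversation_token_totals(turns):
    """Sum billed tokens for a chat where each request resends the full history.

    `turns` is a list of (user_tokens, reply_tokens) pairs, one per exchange.
    """
    history = 0       # tokens already in the conversation
    total_input = 0   # input tokens billed across all requests
    total_output = 0  # output tokens billed across all requests
    for user_tokens, reply_tokens in turns:
        # Each request carries the prior history plus the new user message.
        total_input += history + user_tokens
        total_output += reply_tokens
        # The message and the reply both become history for the next turn.
        history += user_tokens + reply_tokens
    return total_input, total_output


# Hypothetical chat: five exchanges of 100 user tokens and 200 reply tokens each.
total_in, total_out = conversation_token_totals([(100, 200)] * 5)
print(total_in, total_out)
```

Under these assumptions the billed input grows much faster than the output, even though each individual user message is short.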

Question 4

Which AI model has the largest context window?

Accepted Answer

Google's Gemini 1.5 Pro currently offers one of the largest context windows at 1 million tokens. Anthropic's Claude 3 models support 200K tokens, while OpenAI's GPT-4 Turbo supports 128K tokens. Larger context windows allow processing more text in a single request but may cost more.
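A quick way to see what these limits mean in practice is to check whether a given request fits. The sketch below uses the window sizes quoted above and assumes the window is shared between the prompt and the generated reply, which is common but not guaranteed for every model; the function name and example numbers are hypothetical.

```python
# Context window sizes as quoted above, in tokens.
CONTEXT_WINDOWS = {
    "Gemini 1.5 Pro": 1_000_000,
    "Claude 3": 200_000,
    "GPT-4 Turbo": 128_000,
}


def models_that_fit(prompt_tokens, reserved_output_tokens):
    """Return the models whose window holds the prompt plus the reserved reply."""
    needed = prompt_tokens + reserved_output_tokens
    return [name for name, window in CONTEXT_WINDOWS.items() if needed <= window]


# Hypothetical request: a 150,000-token document with 4,000 tokens kept for the answer.
print(models_that_fit(150_000, 4_000))
```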

Question 5

What does 'supports vision' mean for AI models?

Accepted Answer

AI models with vision support can process and understand images in addition to text. You can send images as part of your prompt and the model can describe, analyze, or answer questions about them. Examples include GPT-4 Vision, Claude 3, and Gemini Pro Vision. Vision capabilities typically cost extra per image processed.

Model	Creator	Input Price, $	Output Price, $	Context	Max Output	Inference Providers	Intelligence	Coding
Kimi K2.5	Moonshot AI (Kimi)	0.375	2.02	262K	98K	12	38.1 (#1)	N/A
MiniMax M2.5	MiniMax	0.12	0.48	1.0M	197K	8	33.7 (#2)	N/A
GPT OSS 120B	OpenAI	0.03	0.15	131K	131K	23	23.8 (#3)	30.4 (#1)
DeepSeek V3.1	DeepSeek	0.27	1.00	164K	33K	14	21.0 (#4)	N/A
GLM-4.5	Zhipu AI	0.4	1.60	131K	98K	7	19.5 (#5)	N/A
Qwen3 Coder 480B A35B Instruct	Alibaba	0.22	1.30	262K	66K	8	18.0 (#6)	N/A
DeepSeek V3 0324	DeepSeek	0.2	0.4	164K	16K	13	15.7 (#7)	N/A
GPT OSS 20B	OpenAI	0.0145	0.07	131K	131K	18	14.9 (#8)	20.7 (#2)
Qwen3 235B A22B Instruct	Alibaba	0.09	0.58	262K	16K	11	10.9 (#9)	N/A
Llama 3.3 70B Instruct	Meta	0.1	0.2	131K	120K	21	8.6 (#10)	N/A
Llama 3.1 8B Instruct	Meta	0.02	0.03	200K	128K	21	6.1 (#11)	N/A
DeepSeek R1 0528	DeepSeek	0.2	0.25	164K	33K	13	N/A	N/A
Kimi K2 Instruct	Moonshot AI (Kimi)	0.5	2.00	262K	33K	9	N/A	N/A
Llama 4 17B Scout Instruct	Meta	0.05	0.1	10.0M	16K	12	N/A	N/A
Phi-4 Mini Instruct	Microsoft	0.075	0.3	131K	128K	3	N/A	N/A
Qwen3 235B A22B Thinking	Alibaba	0.1	0.1	262K	33K	9	N/A	N/A
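To show how a table like this might be used for a shortlist, the following sketch filters a few rows by minimum context window and sorts them by output price. The rows are copied from the table; the helper names are hypothetical, and treating "K" as 1,000 and "M" as 1,000,000 is an assumption, since the table does not say how the sizes were rounded.

```python
def parse_tokens(text):
    """Convert a size such as '131K' or '1.0M' to a token count.

    Assumes K means 1,000 and M means 1,000,000.
    """
    multipliers = {"K": 1_000, "M": 1_000_000}
    return int(float(text[:-1]) * multipliers[text[-1]])


# A few rows copied from the table: (model, input price, output price, context).
ROWS = [
    ("GPT OSS 120B", 0.03, 0.15, "131K"),
    ("MiniMax M2.5", 0.12, 0.48, "1.0M"),
    ("Llama 3.3 70B Instruct", 0.1, 0.2, "131K"),
    ("Llama 4 17B Scout Instruct", 0.05, 0.1, "10.0M"),
]


def cheapest_with_context(rows, min_context):
    """Return rows with at least min_context tokens of context, cheapest output first."""
    eligible = [row for row in rows if parse_tokens(row[3]) >= min_context]
    return sorted(eligible, key=lambda row: row[2])


# Example: models offering at least 500,000 tokens of context.
for model, _, output_price, context in cheapest_with_context(ROWS, 500_000):
    print(model, output_price, context)
```

Sorting by output price is only one reasonable choice; a workload dominated by long prompts might sort by input price instead.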

AI Model Comparison