Name: Llama 3.1 8B Instant
Brand: Meta

Llama 3.1 8B Instant is

Meta's language model with a 128K context window and up to 8K output tokens, starting at $0.050 / 1M input and $0.080 / 1M output. A low-latency serving variant of the 8B Llama 3.1 model optimized for fast response times in real-time conversational and API applications.

Spec
Canonical ID	`meta-llama-3-1-8b-instant`
Type	Language
Status	Active
Creator	Meta
Providers	Groq
Context Window	128K tokens
Max Output	8K tokens
Input Modalities	Text
Output Modalities	Text
Parameters	8B

Capabilities

Input1/5

Text✓

Image·

Audio·

Video·

PDF·

Output1/5

Text✓

Image·

Audio·

Video·

Embedding·

Capabilities1/13

Reasoning·

Adaptive Reasoning·

Function Calling✓

Parallel Function Calling·

Structured Outputs·

Native JSON Schema·

Web Search·

URL Context·

Computer Use·

Code Execution·

File Search·

Prompt Caching·

Assistant Prefill·

Pricing by Provider

Provider	Standard
Provider	Input $ / 1M	Output $ / 1M
Groq llama-3.1-8b-instant	$0.050	$0.080

Cost Calculator

Preset:

Input tokens

Output tokens

Number of calls

Compares every provider & tier in USD

Versions

Version	Released	Context	Input / 1M	Output / 1M	Status
Llama 3.3 70B Instruct	2024-12-06	131K	$0.720	$0.720	Available
Llama 3.3 70B Instruct	2024-12-06	131K	$0.100	$0.300	Available
Llama 3.3	—	—	—	—	Available
Llama 3.3 70B Instruct Turbo	—	131K	$0.130	$0.390	Available
Llama 3.3 70B Versatile	—	128K	$0.590	$0.790	Available
Llama 3.3 8B Instruct	—	128K	—	—	Available
Llama 3.2 11B Vision Instruct	2024-09-25	128K	$0.160	$0.160	Available
Llama 3.2 1B Instruct	2024-09-25	128K	$0.027	$0.080	Deprecated
Llama 3.2 3B Instruct	2024-09-25	131K	$0.015	$0.020	Deprecated
Llama 3.2 90B Vision Instruct	2024-09-25	128K	$0.720	$0.720	Available
Llama 3.1 8B Instant	—	128K	$0.050	$0.080	Current

Llama 3.1 8B Instant

Capabilities

Pricing by Provider

Cost Calculator

Versions

Model IDs