Llama 3.1 8B Instant


Llama 3.1 8B Instant is Meta's language model with a 128K context window and up to 8K output tokens, starting at $0.050 / 1M input tokens and $0.080 / 1M output tokens. It is a low-latency serving variant of the 8B Llama 3.1 model, optimized for fast response times in real-time conversational and API applications.
Spec
Canonical ID: meta-llama-3-1-8b-instant
Type: Language
Status: Active
Creator: Meta
Providers
Context Window: 128K tokens
Max Output: 8K tokens
Input Modalities: Text
Output Modalities: Text
Parameters: 8B
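
The two token limits in the spec interact when sizing a request. The sketch below shows one way to cap a requested output length. It assumes that "128K" and "8K" mean 128,000 and 8,000 tokens (a provider may instead use 131,072 and 8,192) and that output tokens count against the context window. The function name `clamp_max_output` is hypothetical and not part of any SDK.

```python
# Limits from the spec above. The exact integer values are an assumption:
# "128K" / "8K" are read here as 128,000 / 8,000 tokens.
CONTEXT_WINDOW_TOKENS = 128_000
MAX_OUTPUT_TOKENS = 8_000


def clamp_max_output(prompt_tokens: int, requested_output_tokens: int) -> int:
    """Return the largest output budget that respects both limits.

    Assumes prompt and output tokens share the context window.
    Returns 0 when the prompt alone fills or exceeds the window.
    """
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    remaining = max(CONTEXT_WINDOW_TOKENS - prompt_tokens, 0)
    return min(requested_output_tokens, MAX_OUTPUT_TOKENS, remaining)
```

For a long prompt, the context window becomes the binding limit rather than the 8K output cap, so the returned budget shrinks as the prompt grows.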

Capabilities

Input (1/5)
Supported: Text
Not supported: Image, Audio, Video, PDF

Output (1/5)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (1/13)
Supported: Function Calling
Not supported: Reasoning, Adaptive Reasoning, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

Provider  Model ID              Tier      Input ($ / 1M)  Output ($ / 1M)
Groq      llama-3.1-8b-instant  Standard  $0.050          $0.080
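
As a rough illustration of how the per-million-token prices translate into a request cost, here is a minimal sketch using the Groq prices listed above. The function name `estimate_cost_usd` is hypothetical, and the calculation ignores any tiers, discounts, or minimum charges a provider may apply.

```python
# Prices in USD per 1M tokens, taken from the Groq pricing listed above.
INPUT_USD_PER_MILLION = 0.050
OUTPUT_USD_PER_MILLION = 0.080


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 2,000-token prompt with a 500-token reply.
    print(f"${estimate_cost_usd(2_000, 500):.6f}")
```

Because the output price is higher than the input price, long replies cost more per token than long prompts.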

Versions

Version                         Released  Context  Input / 1M  Output / 1M  Status
Llama 3.3 70B Instruct          -         131K     $0.720      $0.720       Available
Llama 3.3 70B Instruct          -         131K     $0.100      $0.300       Available
Llama 3.3                       -         -        -           -            Available
Llama 3.3 70B Instruct Turbo    -         131K     $0.130      $0.390       Available
Llama 3.3 70B Versatile         -         128K     $0.590      $0.790       Available
Llama 3.3 8B Instruct           -         128K     -           -            Available
Llama 3.2 11B Vision Instruct   -         128K     $0.160      $0.160       Available
Llama 3.2 1B Instruct           -         128K     $0.027      $0.080       Deprecated
Llama 3.2 3B Instruct           -         131K     $0.015      $0.020       Deprecated
Llama 3.2 90B Vision Instruct   -         128K     $0.720      $0.720       Available
Llama 3.1 8B Instant            -         128K     $0.050      $0.080       Current

Model IDs