Llama 3.1 8B Instruct FP8 is Meta's language model with a 32K context window, starting at $0.152 / 1M input and $0.287 / 1M output. Llama 3.1 8B instruction-tuned model quantized to FP8 precision for reduced memory footprint and faster inference with minimal accuracy loss.
Specifications
Canonical IDmeta-llama-3-1-8b-instruct-fp8
TypeLanguage
StatusActive
CreatorMetaMeta
Providers
Context Window32K tokens
Input ModalitiesText
Output ModalitiesText

Capabilities

Input1/5
Text
Image·
Audio·
Video·
PDF·
Output1/5
Text
Image·
Audio·
Video·
Embedding·
Capabilities0/13
Reasoning·
Adaptive Reasoning·
Function Calling·
Parallel Function Calling·
Structured Outputs·
Native JSON Schema·
Web Search·
URL Context·
Computer Use·
Code Execution·
File Search·
Prompt Caching·
Assistant Prefill·

Pricing by Provider

US Dollar ($)
Per 1M tokens
ProviderStandard
Input
$ / 1M
Output
$ / 1M
Cloudflare Workers AI logo
Cloudflare Workers AI
@cf/meta/llama-3.1-8b-instruct-fp8
$0.152$0.287

Cost Calculator

US Dollar ($)
Preset:

Versions

VersionReleasedContextInput / 1MOutput / 1MStatus
Llama 3.3 70B Instruct131K$0.100$0.200Available
Llama 3.2 3B Instruct131K$0.015$0.020Deprecated
Llama 3.2 1B Instruct131K$0.027$0.080Deprecated
Llama 3.2 11B128K$0.160$0.160Available
Llama 3.1 405B Instruct131K$0.120$0.300Deprecating
Llama 3.1 70B Instruct131K$0.120$0.300Available
Llama 3.1 8B Instruct200K$0.020$0.030Available
Llama 3.1 70B128K$0.360$0.360Available
Llama 3.1 8B131K$0.030$0.050Available
Llama 3 70B Instruct131K$0.120$0.300Deprecating
Llama 3.1 8B Instruct FP832K$0.152$0.287Current

Model IDs

@cf/meta/llama-3.1-8b-instruct-fp8
meta-llama-3-1-8b-instruct-fp8