Llama3.1 Nemotron 70B Instruct Fp8
Llama3.1 Nemotron 70B Instruct Fp8 is a text model from Lambda with a context window of 131K tokens and a maximum output of 131K tokens. Pricing starts at $0.12 per million input tokens and $0.30 per million output tokens (cheapest at Lambda).
Capabilities
✗ Vision · ✓ Function Calling · ✗ Reasoning · ✗ JSON Schema · ✓ System Messages · ✗ Web Search · ✗ Prompt Caching · ✗ Audio Input · ✗ Audio Output
Specifications
| Field | Value |
|---|---|
| Model Key | lambda_ai/llama3.1-nemotron-70b-instruct-fp8 |
| Provider | Lambda |
| Provider ID | lambda_ai |
| Mode | Text |
| Canonical Name | llama-nemotron-3.1-70b |
| Context Window | 131K tokens |
| Max Output | 131K tokens |
Pricing
| Type | Per 1K Tokens, $ | Per 1M Tokens, $ |
|---|---|---|
| Input Tokens | 0.000120 | 0.120 |
| Output Tokens | 0.000300 | 0.300 |
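The per-million prices above translate into a per-request cost by simple proportion. The following is a minimal sketch of that arithmetic, assuming the prices are in USD and that billing is strictly linear in token count; `estimate_cost` is a hypothetical helper name, not part of any provider SDK.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the pricing table above.
INPUT_PRICE_PER_M = Decimal("0.120")
OUTPUT_PRICE_PER_M = Decimal("0.300")


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes linear per-token billing with no minimums or rounding rules.
    """
    million = Decimal(1_000_000)
    total = INPUT_PRICE_PER_M * input_tokens + OUTPUT_PRICE_PER_M * output_tokens
    return total / million


# Example: a request with 100,000 input tokens and 20,000 output tokens.
print(estimate_cost(100_000, 20_000))
```

`Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding noise when summed over many requests.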
Price Comparison by Provider
Compare prices for Llama3.1 Nemotron 70B Instruct Fp8 across different providers. The same model may be available through multiple providers at different price points.
| Provider | Model Key | Input Price, $ | Output Price, $ |
|---|---|---|---|
| lambda_ai | lambda_ai/llama3.1-nemotron-70b-instruct-fp8 | 0.120 | 0.300 |
| fireworks_ai | fireworks_ai/accounts/fireworks/models/llama-v3p1-nemotron-70b-instruct | 0.900 | 0.900 |
| deepinfra | deepinfra/nvidia/Llama-3.1-Nemotron-70B-Instruct | 0.600 | 0.600 |
All Variants
All available versions, regions, and API endpoints for Llama3.1 Nemotron 70B Instruct Fp8.
| Model Key | Provider | Mode | Input Price, $ | Output Price, $ | Context | Max Output | Vision | Functions |
|---|---|---|---|---|---|---|---|---|
| deepinfra/nvidia/Llama-3.1-Nemotron-70B-Instruct | deepinfra | Text | 0.600 | 0.600 | 131K | 131K | no | yes |
| fireworks_ai/accounts/fireworks/models/llama-v3p1-nemotron-70b-instruct | fireworks_ai | Text | 0.900 | 0.900 | 131K | 131K | no | no |
| lambda_ai/llama3.1-nemotron-70b-instruct-fp8 | lambda_ai | Text | 0.120 | 0.300 | 131K | 131K | no | yes |
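Because the variants differ in both price and function-calling support, choosing one is a filter followed by a minimum. The sketch below illustrates that selection using the rows of the table above. It assumes the listed prices are USD per 1M tokens (consistent with the pricing table for the Lambda variant); the `Variant` class and `cheapest` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    model_key: str
    input_price: float   # assumed USD per 1M input tokens
    output_price: float  # assumed USD per 1M output tokens
    functions: bool      # "Functions" column of the variants table


# Rows copied from the variants table above.
VARIANTS = [
    Variant("deepinfra/nvidia/Llama-3.1-Nemotron-70B-Instruct", 0.600, 0.600, True),
    Variant(
        "fireworks_ai/accounts/fireworks/models/llama-v3p1-nemotron-70b-instruct",
        0.900, 0.900, False,
    ),
    Variant("lambda_ai/llama3.1-nemotron-70b-instruct-fp8", 0.120, 0.300, True),
]


def cheapest(variants, input_tokens, output_tokens, need_functions=False):
    """Return the lowest-cost variant for a workload (hypothetical helper)."""
    candidates = [v for v in variants if v.functions or not need_functions]
    if not candidates:
        raise ValueError("no variant satisfies the requirements")
    # The common 1/1,000,000 factor does not affect the ordering, so it is omitted.
    return min(
        candidates,
        key=lambda v: v.input_price * input_tokens + v.output_price * output_tokens,
    )


best = cheapest(VARIANTS, 1_000_000, 250_000, need_functions=True)
print(best.model_key)
```

Weighting by the expected input and output token counts matters when providers price the two differently, as the Lambda variant does here.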