FireLLaVA 13B is a 13B-parameter multimodal vision-language model from Fireworks AI, built on the LLaVA architecture for image understanding and visual question answering. It has a 4K-token context window and supports up to 4K output tokens, with pricing starting at $0.200 per 1M input tokens and $0.200 per 1M output tokens.
Specifications
Canonical ID: fireworks-ai-firellava-13b
Type: Language
Status: Active
Creator: Fireworks AI
Providers
Context Window: 4K tokens
Max Output: 4K tokens
Input Modalities: Text
Output Modalities: Text
Parameters: 13B

Capabilities

Input (1/5)
Supported: Text
Not supported: Image, Audio, Video, PDF

Output (1/5)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (0/13)
Supported: none
Not supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

Standard pricing, in $ per 1M tokens:

Provider      Model ID                                              Input    Output
Fireworks AI  fireworks_ai/accounts/fireworks/models/firellava-13b  $0.200   $0.200
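
As a rough illustration of how the per-million-token prices above translate into the cost of a single request, here is a minimal Python sketch. The function name `estimate_cost` and the constant names are hypothetical, chosen only for this example. The sketch assumes that input and output tokens are billed linearly at the listed standard rates, with no minimum charge or rounding.

```python
from decimal import Decimal

# Listed standard prices for FireLLaVA 13B, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.200")
OUTPUT_PRICE_PER_M = Decimal("0.200")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes simple linear billing at the listed per-1M-token rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M
```

Under these assumptions, a request with 3,000 input tokens and 1,000 output tokens comes to `estimate_cost(3000, 1000)`, which is $0.0008. `Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift.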

Versions

Version        Released  Context  Input / 1M  Output / 1M  Status
FireLLaVA 13B  -         4K       $0.200      $0.200       Current
LLaVA Yi 34B   -         4K       $0.900      $0.900       Available

Model IDs