LLaVA 7B is a 7B-parameter multimodal LLM from Haotian Liu with a 4K-token context window and up to 2K output tokens. It combines a vision encoder with a language model for visual question answering and image-text understanding.
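| Field | Value |
|---|---|
| Model ID | haotian-liu-llava-7b |
| Type | Language |
| Status | Active |
| Provider | Haotian Liu |
| Context window | 4K tokens |
| Max output | 2K tokens |
| Input | Image |
| Output | Text |
| Parameters | 7B |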
Capabilities
Input: 1/5
Output: 1/5
Capabilities: 1/13
Versions
| Version | Released | Context | Input / 1M | Output / 1M | Status |
|---|---|---|---|---|---|
| LLaVA 1.6 7B Mistral | — | 32K | $0.290 | $0.290 | Available |
| LLaVA 7B | — | 4K | — | — | Current |
| LLaVA Yi 34B | — | 4K | $0.900 | $0.900 | Available |
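
The per-1M-token prices in the table translate into a request cost by simple proportion. The following is a minimal sketch of that arithmetic, not part of any provider SDK: the names `PRICES_PER_MILLION` and `estimate_cost` are hypothetical, the prices are copied from the table above, and the token counts are assumed to be whatever the provider actually bills (how image inputs are counted is not stated on this page).

```python
from typing import Optional

# USD per 1M tokens as (input, output), copied from the Versions table.
# None marks a version for which the table lists no price.
PRICES_PER_MILLION = {
    "LLaVA 1.6 7B Mistral": (0.290, 0.290),
    "LLaVA 7B": None,
    "LLaVA Yi 34B": (0.900, 0.900),
}


def estimate_cost(version: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Return the estimated cost in USD, or None if no price is listed."""
    prices = PRICES_PER_MILLION[version]
    if prices is None:
        return None
    input_price, output_price = prices
    # Scale each per-1M price by the number of tokens actually used.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    for name in PRICES_PER_MILLION:
        print(name, estimate_cost(name, 3_000, 1_000))
```

Under these assumptions, 1M input tokens plus 1M output tokens on LLaVA Yi 34B comes to $1.80, while LLaVA 7B returns `None` because the table gives no price for it.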