InternVL3 78B


InternVL3 78B is InternLM's language model with a 16K context window and up to 16K output tokens. It is a 78-billion-parameter multimodal LLM built on a ViT-MLP-LLM architecture, aimed at vision-language understanding and complex multimodal reasoning.
Spec
Canonical ID: internlm-internvl-3-78b
Type: Language
Status: Active
Creator: InternLM
Providers
Context Window: 16K tokens
Max Output: 16K tokens
Input Modalities: Text
Output Modalities: Text
Parameters: 78B

Capabilities

Input (1/5)
Supported: Text
Not supported: Image, Audio, Video, PDF

Output (1/5)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (0/13)
Not supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

Provider      Tier      Input ($ / 1M tokens)  Output ($ / 1M tokens)
Fireworks AI  Standard  $0.900                 $0.900
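
The per-million-token prices above convert to a per-request cost by simple arithmetic. The following is a minimal sketch, assuming the listed Fireworks AI Standard prices apply uniformly to every token and that billing is linear in token count. The function name `estimate_cost_usd` and the constants are hypothetical and not part of any provider SDK.

```python
from decimal import Decimal

# Prices copied from the pricing table (USD per 1M tokens, Fireworks AI, Standard tier).
INPUT_PRICE_PER_1M = Decimal("0.900")
OUTPUT_PRICE_PER_1M = Decimal("0.900")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate a request's cost in USD from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: cost scales linearly with tokens, with no minimum charge or rounding.
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_1M / TOKENS_PER_UNIT
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_1M / TOKENS_PER_UNIT
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: 12,000 prompt tokens and 2,000 completion tokens.
    print(estimate_cost_usd(12_000, 2_000))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift. Actual invoices may differ if the provider applies rounding, minimums, or other tiers.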

Versions

Version        Released  Context  Input / 1M  Output / 1M  Status
InternVL3 78B  -         16K      -           -            Current
InternVL3 38B  -         16K      -           -            Available
InternVL3 8B   -         16K      -           -            Available
InternVL3 38B  -         16K      $0.900      $0.900       Available
InternVL3 78B  -         16K      $0.900      $0.900       Available
InternVL3 8B   -         16K      $0.200      $0.200       Available

Model IDs