Molmo 7B-D is a 7B-parameter open vision-language model from the Allen Institute for AI (Allen AI). It features strong visual grounding and pointing capabilities in a mid-size architecture.
Specifications
Canonical ID: allenai-molmo-7b-d
Type: Language
Status: Active
Creator: Allen AI
Input Modalities: Text
Output Modalities: Text
Parameters: 7B
Benchmarks
Intelligence Index: 9.2 (rank #408)
Coding Index: 1.2 (rank #378)
Math Index: 0.0 (rank #258)
MMLU-Pro: 0.4 (rank #304)
GPQA: 0.2 (rank #438)
HLE: 0.1 (rank #268)
LiveCodeBench: 0.0 (rank #316)
IFBench: 0.2 (rank #386)
Time to First Token
SciCode: 0.0 (rank #426)
AIME 2025: 0.0 (rank #258)
LCR: 0.0 (rank #347)
TerminalBench Hard: 0.0 (rank #342)
TAU2: 0.0 (rank #363)
Output TPS: 0.0 (rank #285)

Capabilities

Input (1/5 supported)
Supported: Text
Not supported: Image, Audio, Video, PDF

Output (1/5 supported)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (0/13 supported)
Not supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Model IDs