Rolm OCR is Rolm's image-to-text model with a 128K-token context window, priced from $0.20 per 1M input tokens and $0.20 per 1M output tokens. It is an open-source document OCR model from Reducto AI, built on Qwen2.5-VL-7B-Instruct, offering faster performance and reduced memory usage as a drop-in alternative to olmOCR.
| Specification | Value |
|---|---|
| Model ID | rolm-ocr |
| Type | Image to Text |
| Status | Active |
| Provider | Rolm |
| Context window | 128K tokens |
| Input | Text |
| Output | Text |
Capabilities
| Group | Supported (✓) | Not supported (·) |
|---|---|---|
| Input (1/5) | Text | Image, Audio, Video, PDF |
| Output (1/5) | Text | Image, Audio, Video, Embedding |
| Capabilities (0/13) | None | Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill |
Pricing by Provider
US Dollar ($), per 1M tokens
| Tier | Input $ / 1M | Output $ / 1M |
|---|---|---|
| Standard | $0.20 | $0.20 |
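The per-token pricing above turns into a job estimate with simple arithmetic. The following is a minimal sketch, not an official calculator: the function name `estimate_cost` is hypothetical, and the only figures taken from this page are the $0.20 input and output prices per 1M tokens.

```python
from decimal import Decimal

# Prices copied from the pricing table on this page (USD per 1M tokens).
INPUT_PRICE_PER_M = Decimal("0.20")
OUTPUT_PRICE_PER_M = Decimal("0.20")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost for a given token volume.

    Hypothetical helper: it only multiplies token counts by the listed
    per-1M prices and ignores any minimums, discounts, or taxes.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = INPUT_PRICE_PER_M * input_tokens + OUTPUT_PRICE_PER_M * output_tokens
    return total / TOKENS_PER_MILLION


# Example: 3M input tokens and 1M output tokens.
cost = estimate_cost(3_000_000, 1_000_000)
print(f"${cost:.2f}")  # prints $0.80
```

`Decimal` is used instead of `float` so that small per-token amounts add up without binary rounding drift.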
Versions
| Version | Released | Context | Input / 1M | Output / 1M | Status |
|---|---|---|---|---|---|
| Voyage Multimodal 3.5 | — | — | — | — | Available |
| Qwen2.5 VL 72B Instruct | — | 131K | $0.130 | $0.400 | Available |
| Rolm OCR | — | 128K | $0.200 | $0.200 | Current |
| Qwen2.5 VL 32B Instruct | — | 131K | $0.200 | $0.600 | Available |
| Qwen2.5 VL 3B Instruct | — | 131K | $0.200 | $0.200 | Available |
| Qwen2.5 VL 7B Instruct | — | 131K | $0.200 | $0.200 | Available |
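Because the listed models differ in how they price input versus output, which one is cheapest depends on the shape of the workload. The sketch below ranks the priced rows of the table for a given token volume. The names `PRICES`, `workload_cost`, and `rank_by_cost` are hypothetical, the example workload is made up for illustration, and Voyage Multimodal 3.5 is omitted because the table lists no prices for it.

```python
from decimal import Decimal

# (input, output) prices in USD per 1M tokens, copied from the Versions table.
PRICES = {
    "Qwen2.5 VL 72B Instruct": (Decimal("0.130"), Decimal("0.400")),
    "Rolm OCR": (Decimal("0.200"), Decimal("0.200")),
    "Qwen2.5 VL 32B Instruct": (Decimal("0.200"), Decimal("0.600")),
    "Qwen2.5 VL 3B Instruct": (Decimal("0.200"), Decimal("0.200")),
    "Qwen2.5 VL 7B Instruct": (Decimal("0.200"), Decimal("0.200")),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated USD cost of one workload on one model (list prices only)."""
    in_price, out_price = PRICES[model]
    total = in_price * input_tokens + out_price * output_tokens
    return total / Decimal(1_000_000)


def rank_by_cost(input_tokens: int, output_tokens: int) -> list[tuple[str, Decimal]]:
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = [(m, workload_cost(m, input_tokens, output_tokens)) for m in PRICES]
    # sorted() is stable, so models with equal cost keep their table order.
    return sorted(costs, key=lambda pair: pair[1])


# Illustrative workload: 10M input tokens, 2M output tokens.
for model, cost in rank_by_cost(10_000_000, 2_000_000):
    print(f"{model}: ${cost:.2f}")
```

With these list prices, Qwen2.5 VL 72B Instruct costs less than Rolm OCR only while output tokens stay below 35% of input tokens (0.13i + 0.40o < 0.20i + 0.20o). The ranking reflects price alone and says nothing about OCR quality or speed.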