ERNIE 4.5 VL 424B A47B is Baidu's vision-language model with a 131K-token context window and up to 16K output tokens. It is available from 3 providers, with pricing starting at $0.42 per 1M input tokens and $1.25 per 1M output tokens. It is a large-scale multimodal Mixture-of-Experts (MoE) model with 424B total parameters, of which 47B are activated per token, designed for cross-modal knowledge fusion.
Capabilities
Input 2/5
Text ✓
Image ✓
Audio ·
Video ·
PDF ·
Output 1/5
Text ✓
Image ·
Audio ·
Video ·
Embedding ·
Capabilities 1/13
Reasoning ✓
Adaptive Reasoning ·
Function Calling ·
Parallel Function Calling ·
Structured Outputs ·
Native JSON Schema ·
Web Search ·
URL Context ·
Computer Use ·
Code Execution ·
File Search ·
Prompt Caching ·
Assistant Prefill ·
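The capability matrix above can be turned into a quick pre-flight check before routing a request to this model. The following is a minimal sketch, not a provider API: the function name `unsupported_requirements` and the lowercase capability keys are hypothetical, and the sets only mirror the entries marked ✓ in the listing.

```python
# Capabilities marked with a check mark in the listing above.
# The lowercase key names are illustrative, not official identifiers.
SUPPORTED_INPUTS = frozenset({"text", "image"})
SUPPORTED_OUTPUTS = frozenset({"text"})
SUPPORTED_FEATURES = frozenset({"reasoning"})


def unsupported_requirements(inputs=(), outputs=(), features=()):
    """Return a sorted list of requested items the listing does not mark as supported."""
    missing = []
    missing += [f"input:{m}" for m in set(inputs) - SUPPORTED_INPUTS]
    missing += [f"output:{m}" for m in set(outputs) - SUPPORTED_OUTPUTS]
    missing += [f"feature:{m}" for m in set(features) - SUPPORTED_FEATURES]
    return sorted(missing)


if __name__ == "__main__":
    # A text + PDF request that also needs function calling is not a fit.
    print(unsupported_requirements(inputs={"text", "pdf"},
                                   features={"function_calling"}))
```

An empty result means every requested modality and feature is marked as supported in the listing; actual behavior may still vary by provider.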
Pricing by Provider
Standard pricing, in US dollars ($) per 1M tokens.

| Provider | Input $ / 1M | Output $ / 1M |
|---|---|---|
| | $0.42 | $1.25 |
| | $0.42 | $1.25 |
| | $0.42 | $1.25 |
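The per-token rates above translate into a request cost by simple proportion. Here is a minimal sketch that assumes the listed standard rates ($0.42 input, $1.25 output per 1M tokens) apply uniformly with no tiering or discounts; the function name `estimate_cost` is illustrative.

```python
from decimal import Decimal

# Listed standard rates in US dollars per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.42")
OUTPUT_PRICE_PER_M = Decimal("1.25")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the listed standard rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_PRICE_PER_M
             + Decimal(output_tokens) * OUTPUT_PRICE_PER_M)
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # 100K input tokens plus the 16K output maximum.
    print(estimate_cost(100_000, 16_000))
```

For example, 100K input tokens and 16K output tokens come to $0.042 + $0.020, or about $0.062. `Decimal` is used rather than `float` so that small per-request amounts add up without binary rounding drift.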
Versions
| Version | Released | Context | Input / 1M | Output / 1M | Status |
|---|---|---|---|---|---|
| ERNIE 5 Thinking Preview | — | — | — | — | Available |
| ERNIE 4.5 21B A3B Thinking | — | 131K | $0.070 | $0.280 | Available |
| ERNIE 4.5 VL 424B A47B | — | 131K | $0.420 | $1.25 | Current |
| ERNIE 4.5 300B A47B | — | 131K | — | — | Available |
| ERNIE 4.5 300B A47B Paddle | — | 123K | $0.280 | $1.10 | Available |
| ERNIE 4.5 VL 28B A3B Thinking | — | 131K | $0.390 | $0.390 | Available |
| ERNIE Image | — | — | — | — | Available |
| ERNIE Image Turbo | — | — | — | — | Available |
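To compare the versions on price for a given workload, the listed rates can be ranked programmatically. This is a rough sketch: the `Version` type and `rank_by_cost` helper are hypothetical names, prices are copied from the table, and entries without listed pricing are skipped rather than guessed.

```python
from decimal import Decimal
from typing import NamedTuple, Optional


class Version(NamedTuple):
    name: str
    input_per_m: Optional[Decimal]   # USD per 1M input tokens, None if unlisted
    output_per_m: Optional[Decimal]  # USD per 1M output tokens, None if unlisted


# Rates as listed in the Versions table; None marks a "—" entry.
VERSIONS = [
    Version("ERNIE 5 Thinking Preview", None, None),
    Version("ERNIE 4.5 21B A3B Thinking", Decimal("0.070"), Decimal("0.280")),
    Version("ERNIE 4.5 VL 424B A47B", Decimal("0.420"), Decimal("1.25")),
    Version("ERNIE 4.5 300B A47B", None, None),
    Version("ERNIE 4.5 300B A47B Paddle", Decimal("0.280"), Decimal("1.10")),
    Version("ERNIE 4.5 VL 28B A3B Thinking", Decimal("0.390"), Decimal("0.390")),
    Version("ERNIE Image", None, None),
    Version("ERNIE Image Turbo", None, None),
]


def rank_by_cost(input_tokens: int, output_tokens: int):
    """Return (name, cost) pairs for priced versions, cheapest first."""
    million = Decimal(1_000_000)
    priced = [
        (v.name, (input_tokens * v.input_per_m + output_tokens * v.output_per_m) / million)
        for v in VERSIONS
        if v.input_per_m is not None and v.output_per_m is not None
    ]
    return sorted(priced, key=lambda pair: pair[1])


if __name__ == "__main__":
    for name, cost in rank_by_cost(1_000_000, 1_000_000):
        print(f"{name}: ${cost}")
```

Price is only one axis: the versions differ in size, modality support, and context length, so the cheapest entry is not necessarily a drop-in substitute.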