MiMo V2.5

MiMo V2.5 is Xiaomi's full-modal perception model, natively understanding images, video, audio, and text, with a 1M-token context window. Its agent performance is comparable to that of MiMo V2.5 Pro.

mimo-v2.5
STABLE
1,000,000-token context
Starting at $0.14/M input tokens
Starting at $0.28/M output tokens
Streaming
Vision
Tools
Reasoning
JSON Output

All Providers for MiMo V2.5

doteb routes requests to the best providers that are able to handle your prompt size and parameters.

Xiaomi
Context: 1M
Input: $0.14/M tokens
Cached: $0.028/M tokens
Output: $0.28/M tokens
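
To make the per-million-token prices above concrete, here is a minimal Python sketch that estimates the cost of a single request. The function name `estimate_cost` and its parameters are hypothetical, not part of any API. It also assumes that cached tokens are a subset of the input tokens and are billed at the cached rate instead of the input rate, which the listing does not state explicitly. The prices are the "starting at" figures and may differ in practice.

```python
from decimal import Decimal

# USD prices per one million tokens, copied from the Xiaomi provider listing.
PRICE_INPUT = Decimal("0.14")
PRICE_CACHED = Decimal("0.028")
PRICE_OUTPUT = Decimal("0.28")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_tokens is the portion of input_tokens served from
    cache, billed at the cached rate rather than the full input rate.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * PRICE_INPUT
        + cached_tokens * PRICE_CACHED
        + output_tokens * PRICE_OUTPUT
    )
    return total / MILLION


if __name__ == "__main__":
    # 500k input tokens (half of them cached) plus 100k output tokens.
    print(estimate_cost(500_000, 100_000, cached_tokens=250_000))
```

Under these assumptions, a request that fills the whole 1M context with uncached input and produces 1M output tokens would cost $0.14 + $0.28 = $0.42.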