ResNet 18 is a lightweight 18-layer residual network from Microsoft for image classification. It takes an image as input and returns text, which is why the catalog lists it under Image to Text. It balances efficiency and accuracy for vision tasks with limited compute.
| Field | Value |
|---|---|
| Model ID | microsoft-resnet18 |
| Type | Image to Text |
| Status | Active |
| Input | Image |
| Output | Text |
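An image classifier is listed with text output because what it typically returns is a class label: the network produces one score per class, and the name of the highest-scoring class is reported. The following is a minimal sketch of that final step only, not the model's actual serving code. The scores, the label names, and the helper functions `softmax` and `top_label` are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Convert raw class scores (logits) into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(scores, labels):
    """Return the (label, probability) pair for the highest-scoring class."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical values for illustration; these are not real ResNet 18 outputs.
example_labels = ["cat", "dog", "car"]
example_scores = [2.0, 0.5, -1.0]

label, prob = top_label(example_scores, example_labels)
print(f"{label} ({prob:.2f})")
```

A real deployment would take the scores from the network's final layer and the label names from the dataset the model was trained on. Only the mapping from scores to a text label is shown here.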
Capabilities
Input 1/5
Text ·
Image ✓
Audio ·
Video ·
PDF ·
Output 1/5
Text ✓
Image ·
Audio ·
Video ·
Embedding ·
Capabilities 0/13
Reasoning ·
Adaptive Reasoning ·
Function Calling ·
Parallel Function Calling ·
Structured Outputs ·
Native JSON Schema ·
Web Search ·
URL Context ·
Computer Use ·
Code Execution ·
File Search ·
Prompt Caching ·
Assistant Prefill ·
Versions
| Version | Released | Context | Input / 1M | Output / 1M | Status |
|---|---|---|---|---|---|
| ResNet V2 101 | - | - | - | - | Available |
| ResNet V2 50 | - | - | - | - | Available |
| ResNet V2 Classification | - | - | - | - | Available |
| ResNet V2 Featurevector | - | - | - | - | Available |
| ResNet V1 101 | - | - | - | - | Available |
| ResNet V1 152 | - | - | - | - | Available |
| ResNet V1 50 | - | - | - | - | Available |
| ResNet V1 Classification | - | - | - | - | Available |
| ResNet 18 | - | - | - | - | Current |
| ResNet 101 | - | - | - | - | Available |
| ResNet 152 | - | - | - | - | Available |