BiT-M R50x3 is Google's Big Transfer (BiT-M) image classification model, listed here in the image-to-text category. It uses a ResNet-50x3 backbone (a ResNet-50 with three times the channel width), is pretrained on ImageNet-21k, and is fine-tuned on ILSVRC-2012 for visual recognition tasks.
| Model ID | Type | Status | Input | Output |
|---|---|---|---|---|
| google-bit-m-r50x3-ilsvrc2012-classification | Image to Text | Active | Image | Text |
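A classifier like this one is listed with text output because its raw scores are typically mapped to class names. The sketch below illustrates that mapping under stated assumptions: the model is assumed to return one logit per class, and the four labels and scores are hypothetical stand-ins for the 1000 ILSVRC-2012 classes. It is not the catalog's own inference code.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical 4-class example; a real ILSVRC-2012 head has 1000 classes.
labels = ["tabby cat", "golden retriever", "sports car", "espresso"]
logits = [2.0, 0.5, -1.0, 0.1]
predictions = top_k_labels(logits, labels, k=2)
```

Here `predictions` holds the two highest-scoring labels with their probabilities. In practice the label list would come from the class index file distributed with the model.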
Capabilities
Input 1/5
Text ·
Image ✓
Audio ·
Video ·
PDF ·
Output 1/5
Text ✓
Image ·
Audio ·
Video ·
Embedding ·
Capabilities 0/13
Reasoning ·
Adaptive Reasoning ·
Function Calling ·
Parallel Function Calling ·
Structured Outputs ·
Native JSON Schema ·
Web Search ·
URL Context ·
Computer Use ·
Code Execution ·
File Search ·
Prompt Caching ·
Assistant Prefill ·
Versions
| Version | Released | Context | Input / 1M | Output / 1M | Status |
|---|---|---|---|---|---|
| BiT-M R50x3 | n/a | n/a | n/a | n/a | Current |
| BiT-M Classification | n/a | n/a | n/a | n/a | Available |
| BiT-M Feature Vector | n/a | n/a | n/a | n/a | Available |
| BiT-M R50x3 ImageNet-21k | n/a | n/a | n/a | n/a | Available |
| BiT-S R101x1 | n/a | n/a | n/a | n/a | Available |
| BiT-S R101x1 Feature Vector | n/a | n/a | n/a | n/a | Available |
| BiT-S R101x3 | n/a | n/a | n/a | n/a | Available |
| BiT-S R101x3 Feature Vector | n/a | n/a | n/a | n/a | Available |
| BiT-S R152x4 | n/a | n/a | n/a | n/a | Available |
| BiT-S R50x1 | n/a | n/a | n/a | n/a | Available |