BiT-M R50x3 ImageNet-21k


BiT-M R50x3 ImageNet-21k is Google's Big Transfer (BiT-M) image classification model. It uses a ResNet-50x3 backbone trained on ImageNet-21k and is designed for transfer learning across diverse visual tasks. This catalog lists it as an image-to-text model: it takes an image as input and returns its output as text.
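For a classifier, "image in, text out" most plausibly means that the text returned is a class label. The sketch below illustrates that last step only. It assumes the model emits one raw score (logit) per class, which is the usual arrangement for classification heads but is not stated on this page. The label names, scores, and the helper names `softmax` and `top_k_labels` are hypothetical and chosen for illustration.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_labels(logits, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    if len(logits) != len(labels):
        raise ValueError("need exactly one label per logit")
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical stand-ins: a real classification head emits one logit per
# training class, and the label list comes from the training dataset.
labels = ["tabby cat", "golden retriever", "sports car", "espresso"]
logits = [4.1, 1.3, -0.5, 0.2]

for label, prob in top_k_labels(logits, labels, k=2):
    print(f"{label}: {prob:.3f}")
```

Sorting the full list is fine for a handful of classes; with a label set as large as ImageNet-21k, a partial selection such as `heapq.nlargest` would avoid ordering every entry.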
Specifications
Canonical ID: google-bit-m-r50x3-imagenet21k-classification
Type: Image to Text
Status: Active
Creator: Google
Input Modalities: Image
Output Modalities: Text

Capabilities

Input 1/5
Text ·
Image ✓
Audio ·
Video ·
PDF ·
Output 1/5
Text ✓
Image ·
Audio ·
Video ·
Embedding ·
Capabilities 0/13
Reasoning ·
Adaptive Reasoning ·
Function Calling ·
Parallel Function Calling ·
Structured Outputs ·
Native JSON Schema ·
Web Search ·
URL Context ·
Computer Use ·
Code Execution ·
File Search ·
Prompt Caching ·
Assistant Prefill ·

Versions

Version                      Released  Context  Input / 1M  Output / 1M  Status
BiT-M R50x3 ImageNet-21k     -         -        -           -            Current
BiT-M Classification         -         -        -           -            Available
BiT-M Feature Vector         -         -        -           -            Available
BiT-M R50x3                  -         -        -           -            Available
BiT-M R50x3 ImageNet-21k     -         -        -           -            Available
BiT-S R101x1                 -         -        -           -            Available
BiT-S R101x1 Feature Vector  -         -        -           -            Available
BiT-S R101x3                 -         -        -           -            Available
BiT-S R101x3 Feature Vector  -         -        -           -            Available
BiT-S R152x4                 -         -        -           -            Available
BiT-S R50x1                  -         -        -           -            Available

Model IDs