BiT-S R50x3 Feature Vector


BiT-S R50x3 Feature Vector is Google's image-to-text model: a Big Transfer (BiT-S) vision model with a ResNet-50x3 backbone trained on ILSVRC-2012. Its increased width yields feature vectors that carry richer image representations.
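Feature vectors like the ones this model produces are commonly compared with cosine similarity, for example to rank images by visual closeness. The following is a minimal sketch of that comparison, not this model's API: the function name `cosine_similarity` and the short toy vectors are hypothetical stand-ins, and real feature vectors from the model would be much longer.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors standing in for model outputs (assumption:
# two images of similar content and one image of different content).
features_cat_1 = [0.9, 0.1, 0.0, 0.4]
features_cat_2 = [0.8, 0.2, 0.1, 0.5]
features_car = [0.0, 0.9, 0.7, 0.1]

sim_same = cosine_similarity(features_cat_1, features_cat_2)
sim_diff = cosine_similarity(features_cat_1, features_car)
```

With these toy values, `sim_same` comes out higher than `sim_diff`, which is the behavior one would hope for from a useful image representation.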
Specifications
Canonical ID: google-bit-s-r50x3-ilsvrc2012
Type: Image to Text
Status: Active
Creator: Google
Input Modalities: Image
Output Modalities: Text

Capabilities

Input (1/5)
Text: ·
Image: ✓
Audio: ·
Video: ·
PDF: ·

Output (1/5)
Text: ✓
Image: ·
Audio: ·
Video: ·
Embedding: ·

Capabilities (0/13)
Reasoning: ·
Adaptive Reasoning: ·
Function Calling: ·
Parallel Function Calling: ·
Structured Outputs: ·
Native JSON Schema: ·
Web Search: ·
URL Context: ·
Computer Use: ·
Code Execution: ·
File Search: ·
Prompt Caching: ·
Assistant Prefill: ·

Versions

Version | Released | Context | Input / 1M | Output / 1M | Status
BiT-S R50x3 Feature Vector | - | - | - | - | Current
BiT-M Classification | - | - | - | - | Available
BiT-M Feature Vector | - | - | - | - | Available
BiT-M R50x3 | - | - | - | - | Available
BiT-M R50x3 ImageNet-21k | - | - | - | - | Available
BiT-M R50x3 ImageNet-21k | - | - | - | - | Available
BiT-S R101x1 | - | - | - | - | Available
BiT-S R101x1 Feature Vector | - | - | - | - | Available
BiT-S R101x3 | - | - | - | - | Available
BiT-S R101x3 Feature Vector | - | - | - | - | Available
BiT-S R152x4 | - | - | - | - | Available

Model IDs