
MobileNet 3


MobileNet 3 (MobileNetV3) is Google's image classification model, cataloged here as Image to Text because it takes an image as input and returns a text label. It offers improved accuracy-efficiency trade-offs and is available in small and large variants.
Specifications
Canonical ID: google-mobilenet-3
Type: Image to Text
Status: Active
Creator: Google
Input Modalities: Image
Output Modalities: Text

Capabilities

Input: 1/5
Text ·
Image ✓
Audio ·
Video ·
PDF ·

Output: 1/5
Text ✓
Image ·
Audio ·
Video ·
Embedding ·

Capabilities: 0/13
Reasoning ·
Adaptive Reasoning ·
Function Calling ·
Parallel Function Calling ·
Structured Outputs ·
Native JSON Schema ·
Web Search ·
URL Context ·
Computer Use ·
Code Execution ·
File Search ·
Prompt Caching ·
Assistant Prefill ·

Versions

Version                                Released  Context  Input / 1M  Output / 1M  Status
MobileNet 3                            -         -        -           -            Current
MobileNet V3 Large                     -         -        -           -            Available
MobileNet 2                            -         -        -           -            Available
MobileNet 2 Classification             -         -        -           -            Available
MobileNet 2 Featurevector              -         -        -           -            Available
MobileNet V2 0.35 224 Feature Vector   -         -        -           -            Available
MobileNet V2 0.75 224                  -         -        -           -            Available
MobileNet V2 1.30 224 Feature Vector   -         -        -           -            Available
MobileNet V2 1.40 224 Feature Vector   -         -        -           -            Available
MobileNet V1 0.25 128                  -         -        -           -            Available
MobileNet V1 0.25 128 Feature Vector   -         -        -           -            Available
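
The version names listed above pack several attributes into one string. The sketch below shows one way to split them into fields. It assumes the common MobileNet naming convention, which this page does not state: the decimal (for example 0.35) is the width multiplier and the integer after it (for example 224) is the input resolution in pixels. The names `MobileNetVariant` and `parse_variant` are hypothetical helpers, not part of any Google API.

```python
import re
from typing import NamedTuple, Optional


class MobileNetVariant(NamedTuple):
    """Fields recovered from a catalog version name (hypothetical helper type)."""
    generation: int                    # 1, 2 or 3
    size: Optional[str]                # "Large" / "Small" when the name states it
    width_multiplier: Optional[float]  # assumed meaning of e.g. 0.35
    input_size: Optional[int]          # assumed meaning of e.g. 224 (pixels)
    feature_vector: bool               # True for "Feature Vector" / "Featurevector"


# Matches names such as "MobileNet 3", "MobileNet V3 Large",
# "MobileNet V2 0.35 224 Feature Vector" and "MobileNet 2 Featurevector".
_PATTERN = re.compile(
    r"^MobileNet\s+V?(?P<gen>\d+)"
    r"(?:\s+(?P<size>Large|Small))?"
    r"(?:\s+(?P<mult>\d\.\d+)\s+(?P<res>\d+))?"
    r"(?:\s+(?P<kind>Feature\s?vector|Classification))?$",
    re.IGNORECASE,
)


def parse_variant(name: str) -> MobileNetVariant:
    """Parse one version name from the table; raise ValueError if it does not fit."""
    m = _PATTERN.match(name.strip())
    if m is None:
        raise ValueError(f"unrecognized version name: {name!r}")
    kind = m.group("kind")
    return MobileNetVariant(
        generation=int(m.group("gen")),
        size=m.group("size"),
        width_multiplier=float(m.group("mult")) if m.group("mult") else None,
        input_size=int(m.group("res")) if m.group("res") else None,
        feature_vector=kind is not None and kind.lower().startswith("feature"),
    )


if __name__ == "__main__":
    for name in ("MobileNet V3 Large", "MobileNet V2 0.35 224 Feature Vector"):
        print(name, "->", parse_variant(name))
```

Fields that a name does not mention are returned as `None` rather than guessed, so "MobileNet 3" yields only a generation number.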

Model IDs