MobileNet V2 0.75 224 is Google's image-to-text model: a lightweight image classification feature extractor based on MobileNet V2 with a 0.75 width multiplier and 224px input resolution, optimized for efficient on-device inference.
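To make the "0.75 width multiplier" concrete, here is a minimal sketch of how a width multiplier is commonly applied to a MobileNet V2 backbone. It assumes the baseline per-stage channel widths from the published MobileNet V2 architecture and the round-to-a-multiple-of-8 convention used in common open-source implementations; the names `make_divisible`, `BASE_CHANNELS`, and `scaled_channels` are illustrative and are not taken from this page. The exact layer widths of this specific checkpoint may differ.

```python
# Illustrative sketch only: assumes baseline MobileNet V2 stage widths and
# the "round to a multiple of 8" convention; not this catalog's own code.

def make_divisible(value, divisor=8, min_value=None):
    """Round `value` to the nearest multiple of `divisor`,
    never dropping more than 10% below the requested value."""
    if min_value is None:
        min_value = divisor
    rounded = max(min_value, int(value + divisor / 2) // divisor * divisor)
    if rounded < 0.9 * value:
        rounded += divisor
    return rounded

# Assumed baseline (multiplier 1.0) widths: stem conv, then the seven
# inverted-residual stages.
BASE_CHANNELS = [32, 16, 24, 32, 64, 96, 160, 320]

def scaled_channels(alpha):
    """Channel widths after applying width multiplier `alpha`."""
    return [make_divisible(c * alpha) for c in BASE_CHANNELS]

if __name__ == "__main__":
    print(scaled_channels(0.75))  # [24, 16, 24, 24, 48, 72, 120, 240]
```

The multiplier thins every layer rather than removing layers, so compute and parameter count fall roughly with the square of the multiplier while the input resolution (224px here) is chosen independently.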
Specifications
Canonical ID: google-mobilenet-2-75-224
Type: Image to Text
Status: Active
Creator: Google
Input Modalities: Image
Output Modalities: Text

Capabilities

Input (1/5)
Supported: Image
Not supported: Text, Audio, Video, PDF

Output (1/5)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (0/13)
Not supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Versions

Version | Released | Context | Input / 1M | Output / 1M | Status
MobileNet 3 | | | | | Available
MobileNet V3 Large | | | | | Available
MobileNet V2 0.75 224 | | | | | Current
MobileNet 2 | | | | | Available
MobileNet 2 Classification | | | | | Available
MobileNet 2 Featurevector | | | | | Available
MobileNet V2 0.35 224 Feature Vector | | | | | Available
MobileNet V2 1.30 224 Feature Vector | | | | | Available
MobileNet V2 1.40 224 Feature Vector | | | | | Available
MobileNet V1 0.25 128 | | | | | Available
MobileNet V1 0.25 128 Feature Vector | | | | | Available

Model IDs