AutoML Vision Image Object Detection


AutoML Vision Image Object Detection is Google's image-to-text model. It is Google Cloud's AutoML Vision offering for automatically training custom object detection models that identify and localize objects in images.

Specifications
Canonical ID: google-automl-vision-image-object-detection
Type: Image to Text
Status: Active
Creator: Google
Input Modalities: Image
Output Modalities: Text

Capabilities

Input: 1/5
Text: no
Image: yes
Audio: no
Video: no
PDF: no
Output: 1/5
Text: yes
Image: no
Audio: no
Video: no
Embedding: no
Capabilities: 0/13
Reasoning: no
Adaptive Reasoning: no
Function Calling: no
Parallel Function Calling: no
Structured Outputs: no
Native JSON Schema: no
Web Search: no
URL Context: no
Computer Use: no
Code Execution: no
File Search: no
Prompt Caching: no
Assistant Prefill: no

Versions

Version                                 Released  Context  Input / 1M  Output / 1M  Status
AutoML Vision Image Object Detection    -         -        -           -            Current
AutoML E2E                              -         -        -           -            Available
AutoML Vision Image Classification      -         -        -           -            Available

Model IDs