Name: Faster R-CNN Inception-ResNet 2
Brand: Google

Faster R-CNN Inception-ResNet 2 is Google's image-to-text model. It is a Faster R-CNN object detection model that uses an Inception-ResNet V2 backbone for high-accuracy detection at 1024×1024 resolution.

Specifications
Canonical ID	`google-faster-rcnn-inception-resnet-2`
Type	Image to Text
Status	Active
Creator	Google
Input Modalities	Image
Output Modalities	Text
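
The card lists text as the output modality for what is described as an object detection model, so the detections (labels, confidence scores, bounding boxes) would have to be serialized somehow. The sketch below shows one plausible way to do that. It is illustrative only: the `Detection` type, the `to_pixels` and `format_detections` helpers, the normalized `(ymin, xmin, ymax, xmax)` box layout, and the 0.5 score threshold are all assumptions, not a documented interface of this model.

```python
from dataclasses import dataclass

INPUT_SIZE = 1024  # resolution stated in the model description


@dataclass
class Detection:
    """Hypothetical container for one detected object."""
    label: str
    score: float  # confidence in [0, 1]
    box: tuple    # assumed layout: (ymin, xmin, ymax, xmax), normalized to [0, 1]


def to_pixels(box, size=INPUT_SIZE):
    """Scale a normalized box to integer pixel coordinates."""
    return tuple(round(v * size) for v in box)


def format_detections(detections, min_score=0.5):
    """Return one text line per detection at or above min_score, best first."""
    kept = sorted(
        (d for d in detections if d.score >= min_score),
        key=lambda d: d.score,
        reverse=True,
    )
    lines = []
    for d in kept:
        ymin, xmin, ymax, xmax = to_pixels(d.box)
        lines.append(f"{d.label} {d.score:.2f} [{xmin}, {ymin}, {xmax}, {ymax}]")
    return "\n".join(lines)


# Made-up sample values, purely for demonstration.
sample = [
    Detection("dog", 0.91, (0.25, 0.125, 0.75, 0.5)),
    Detection("cat", 0.32, (0.0, 0.0, 0.5, 0.5)),
    Detection("person", 0.98, (0.0, 0.5, 1.0, 1.0)),
]
print(format_detections(sample))
```

With the sample values, the low-confidence "cat" entry is dropped and the remaining two lines are printed in descending score order. A real deployment may use a different box convention or threshold.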

Capabilities

Input (1/5 supported)
Modality	Supported
Text	No
Image	Yes
Audio	No
Video	No
PDF	No

Output (1/5 supported)
Modality	Supported
Text	Yes
Image	No
Audio	No
Video	No
Embedding	No

Capabilities (0/13 supported)
Capability	Supported
Reasoning	No
Adaptive Reasoning	No
Function Calling	No
Parallel Function Calling	No
Structured Outputs	No
Native JSON Schema	No
Web Search	No
URL Context	No
Computer Use	No
Code Execution	No
File Search	No
Prompt Caching	No
Assistant Prefill	No

Versions

Version	Released	Context	Input / 1M	Output / 1M	Status
Faster R-CNN MobileNet V3	—	—	—	—	Available
Faster R-CNN Inception-ResNet 2	—	—	—	—	Current
Faster R-CNN	—	—	—	—	Available
Faster R-CNN ResNet-101	—	—	—	—	Available
Faster R-CNN ResNet-152	—	—	—	—	Available
Faster R-CNN ResNet-50	—	—	—	—	Available
Faster R-CNN ResNet50	—	—	—	—	Available
