F-VLM JAX is Google's image-to-text model: a JAX implementation of F-VLM, an open-vocabulary object detector built on frozen vision and language models, also used for vision-language grounding.
| google-f-vlm-jax | Image to Text | Active | Image | Text |
Capabilities
Input: 1/5 (Image)
Output: 1/5 (Text)
Capabilities: 0/13