Name: HEAR
Brand: Google

HEAR is Google's speech-to-text model. It is described as Google's Holistic Evaluation of Audio Representations benchmark model for general-purpose audio feature extraction and evaluation.

Specifications
Canonical ID	`google-hear`
Type	Speech to Text
Status	Active
Creator	Google
Input Modalities	Audio
Output Modalities	Text

Capabilities

Input (1/5)
Text	·
Image	·
Audio	✓
Video	·
PDF	·

Output (1/5)
Text	✓
Image	·
Audio	·
Video	·
Embedding	·

Capabilities (0/13)
Reasoning	·
Adaptive Reasoning	·
Function Calling	·
Parallel Function Calling	·
Structured Outputs	·
Native JSON Schema	·
Web Search	·
URL Context	·
Computer Use	·
Code Execution	·
File Search	·
Prompt Caching	·
Assistant Prefill	·

Model IDs

google-hear

publishers/google/models/hear

HEAR

API	GET /api/v1/models/google-hear
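
A lookup against the GET endpoint listed for this model might look like the sketch below. Only the path `/api/v1/models/google-hear` comes from this page. The host (`https://example.com`), the helper names `model_url` and `fetch_model`, the idea that other model IDs follow the same path pattern, and the JSON response format are all assumptions made for illustration.

```python
import json
import urllib.request

# Hypothetical host: this page gives only the path, not the server.
BASE_URL = "https://example.com"


def model_url(model_id: str, base_url: str = BASE_URL) -> str:
    """Build the lookup URL, assuming the path pattern /api/v1/models/<id>."""
    return f"{base_url.rstrip('/')}/api/v1/models/{model_id}"


def fetch_model(model_id: str, base_url: str = BASE_URL):
    """GET the model record and decode it, assuming the body is JSON."""
    request = urllib.request.Request(model_url(model_id, base_url), method="GET")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling `model_url("google-hear")` only builds the URL string. `fetch_model("google-hear")` performs the request and will fail until `BASE_URL` points at the real service.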
