Inworld Realtime TTS 1.5 Max is a high-quality multilingual text-to-speech (TTS) model from Inworld AI. It supports 130+ preset voices across 15 languages, along with voice cloning, word-level timestamps, and streaming.
Specifications
Canonical ID: inworld-realtime-tts-1-5-max
Type: Text to Speech
Status: Active
Creator: Inworld
Input Modalities: Text
Output Modalities: Audio

Capabilities

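Input (1 of 5 supported)
Text: supported
Image: not supported
Audio: not supported
Video: not supported
PDF: not supported
Output (1 of 5 supported)
Text: not supported
Image: not supported
Audio: supported
Video: not supported
Embedding: not supported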
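Capabilities (0 of 13 supported)
Reasoning: not supported
Adaptive Reasoning: not supported
Function Calling: not supported
Parallel Function Calling: not supported
Structured Outputs: not supported
Native JSON Schema: not supported
Web Search: not supported
URL Context: not supported
Computer Use: not supported
Code Execution: not supported
File Search: not supported
Prompt Caching: not supported
Assistant Prefill: not supported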

Versions

Version | Released | Context | Input / 1M | Output / 1M | Status
Inworld Realtime TTS 1.5 Max | - | - | - | - | Current
Inworld TTS 1.5 Max | - | - | - | - | Available

Other Models

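Model | Tier | Released | Context | Input / 1M | Output / 1M
Inworld Realtime TTS 2 | - | - | - | - | -
Step TTS 2 | - | - | - | - | -
StyleTTS 2 | - | - | - | - | -
TTS HD 2.5 | - | - | - | - | -
Inworld TTS 1 Max | - | - | - | - | -
TTS 1 | - | - | - | - | -
TTS 1 HD | - | - | - | - | -
Inworld Realtime TTS 1.5 Mini | Mini | - | - | - | -
Inworld TTS 1.5 Mini | Mini | - | - | - | -
Azure Neural | - | - | - | - | -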

Model IDs

inworld-ai/realtime-tts-1.5-max
inworld-realtime-tts-1-5-max
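
Both identifiers above are listed for this model, and the second matches the Canonical ID under Specifications. The following is a minimal sketch of normalizing either form to the canonical one on the client side. It assumes the two IDs are interchangeable aliases; `MODEL_ALIASES` and `canonical_model_id` are hypothetical names used for illustration and are not part of any Inworld SDK.

```python
# Hypothetical client-side lookup; not an Inworld API.
# Assumption: both IDs listed on this page refer to the same model.
CANONICAL_ID = "inworld-realtime-tts-1-5-max"

MODEL_ALIASES = {
    "inworld-ai/realtime-tts-1.5-max": CANONICAL_ID,
    "inworld-realtime-tts-1-5-max": CANONICAL_ID,
}


def canonical_model_id(model_id: str) -> str:
    """Return the canonical ID for a known model ID, or raise KeyError."""
    # Normalize surrounding whitespace and letter case before the lookup.
    key = model_id.strip().lower()
    try:
        return MODEL_ALIASES[key]
    except KeyError:
        raise KeyError(f"unknown model ID: {model_id!r}") from None
```

Calling `canonical_model_id("inworld-ai/realtime-tts-1.5-max")` returns `inworld-realtime-tts-1-5-max`, so downstream code only has to handle one identifier.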