TTS HD is Microsoft's text-to-speech model: a high-definition variant of the Microsoft Azure TTS service that offers improved audio fidelity and more natural prosody.
Specifications
Canonical ID: microsoft-tts-hd
Type: Text to Speech
Status: Active
Creator: Microsoft
Providers: Azure AI Foundry
Input Modalities: Text
Output Modalities: Audio

Capabilities

Input: 1/5
Supported: Text
Not supported: Image, Audio, Video, PDF
Output: 1/5
Supported: Audio
Not supported: Text, Image, Video, Embedding
Capabilities: 0/13
Not supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Pricing by Provider

Currency: US Dollar ($)
Provider: Azure AI Foundry
Model ID: azure/speech/azure-tts-hd
Tier: Standard
Audio In: $0.030 per 1K characters
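
As a rough illustration of what the listed rate means in practice, the sketch below estimates the cost of synthesizing a piece of text at $0.030 per 1K characters. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes billing is strictly linear in the character count as measured by Python's `len`; the provider's actual rules for counting characters (markup, whitespace, multi-byte characters) may differ.

```python
from decimal import Decimal

# Listed Standard rate for azure/speech/azure-tts-hd, in USD per 1,000 characters.
PRICE_PER_1K_CHARS_USD = Decimal("0.030")


def estimate_cost_usd(text: str) -> Decimal:
    """Estimate synthesis cost for `text`.

    Assumption: cost scales linearly with len(text), which counts Unicode
    code points. The provider's billing may count characters differently.
    """
    char_count = len(text)
    return PRICE_PER_1K_CHARS_USD * Decimal(char_count) / Decimal(1000)


if __name__ == "__main__":
    sample = "Hello, world! " * 1000  # 14,000 characters
    print(f"{len(sample)} characters: ${estimate_cost_usd(sample)}")
```

Under these assumptions, a 50,000-character script would come to 50 × $0.030 = $1.50. `Decimal` is used rather than `float` so that small per-request amounts add up without binary rounding drift.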

Versions

Version: Status (the Released, Context, Input / 1M, and Output / 1M columns are empty for every version)
Inworld Realtime TTS 2: Available
Step TTS 2: Available
StyleTTS 2: Available
TTS HD 2.5: Available
Inworld Realtime TTS 1.5 Max: Available
Inworld Realtime TTS 1.5 Mini: Available
Inworld TTS 1 Max: Available
Inworld TTS 1.5 Max: Available
Inworld TTS 1.5 Mini: Available
TTS 1: Available
TTS HD: Current

Model IDs

azure/speech/azure-tts-hd
microsoft-tts-hd