Name: Base Video
Brand: Deepgram

Base Video is Deepgram's base-tier speech-to-text (ASR) model, designed for transcribing speech from video content.

Specifications
Canonical ID	`deepgram-base-video`
Type	Speech to Text
Status	Active
Creator	Deepgram
Providers	Deepgram
Input Modalities	Audio
Output Modalities	Text

Capabilities

Input (1 of 5 supported)
Modality	Supported
Text	No
Image	No
Audio	Yes
Video	No
PDF	No

Output (1 of 5 supported)
Modality	Supported
Text	Yes
Image	No
Audio	No
Video	No
Embedding	No

Capabilities (0 of 13 supported)
Capability	Supported
Reasoning	No
Adaptive Reasoning	No
Function Calling	No
Parallel Function Calling	No
Structured Outputs	No
Native JSON Schema	No
Web Search	No
URL Context	No
Computer Use	No
Code Execution	No
File Search	No
Prompt Caching	No
Assistant Prefill	No

Pricing by Provider

US Dollar ($), billed per second of audio input

Provider	Audio In $ / sec
Deepgram `deepgram/base-video`	$0.000208
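
Because the listed rate is per second of audio rather than per token, a rough cost estimate is a single multiplication. The sketch below assumes simple linear billing at the listed rate, with no rounding, minimum charge, or volume discount (none of which the listing specifies); the function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Listed rate for Deepgram `deepgram/base-video`: USD per second of audio input.
RATE_USD_PER_SECOND = Decimal("0.000208")


def estimate_cost(duration_seconds, calls=1):
    """Estimate the USD cost of transcribing `calls` files of equal length.

    Assumes linear per-second billing; actual invoices may round differently.
    """
    return RATE_USD_PER_SECOND * Decimal(str(duration_seconds)) * calls


if __name__ == "__main__":
    print("1 minute:", estimate_cost(60))
    print("1 hour:", estimate_cost(3600))
    print("10 one-minute calls:", estimate_cost(60, calls=10))
```

`Decimal` is used instead of `float` so that very small per-second rates do not pick up binary rounding noise when multiplied over long durations.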

Versions

Version	Released	Context	Input / 1M	Output / 1M	Status
Base Video	—	—	—	—	Current
Base ConversationalAI	—	—	—	—	Available
Base Finance	—	—	—	—	Available
Base General	—	—	—	—	Available
Base Meeting	—	—	—	—	Available
Base Phonecall	—	—	—	—	Available
Base Voicemail	—	—	—	—	Available
Deepgram Base	—	—	—	—	Available
Deepgram Enhanced	—	—	—	—	Available
Enhanced Finance	—	—	—	—	Available
Enhanced General	—	—	—	—	Available

Model IDs

deepgram-base-video

deepgram/base-video

Base Video

API Endpoints
Section	Request
Capabilities	GET /api/v1/models/deepgram-base-video
Pricing by Provider	GET /api/v1/models/deepgram-base-video/pricing
Cost Calculator	GET /api/v1/models/deepgram-base-video/pricing/calculate?input_tokens=1000000&output_tokens=500000
Versions	GET /api/v1/models?family=deepgram
Model IDs	GET /api/v1/models/deepgram-base-video
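
The endpoint paths listed above can be assembled into full request URLs with the standard library. This is a minimal sketch: the host `https://example.com` is a placeholder, since the listing gives paths only, and the helper name `model_endpoints` is hypothetical. No request is sent, so authentication and response formats are left out.

```python
from urllib.parse import urlencode, urljoin

# Placeholder host (assumption): the listing shows only the paths.
BASE_URL = "https://example.com"


def model_endpoints(model_id, family, input_tokens, output_tokens):
    """Build the catalog URLs for one model from the paths shown in the listing."""
    root = f"/api/v1/models/{model_id}"
    calc_query = urlencode(
        {"input_tokens": input_tokens, "output_tokens": output_tokens}
    )
    family_query = urlencode({"family": family})
    return {
        "capabilities": urljoin(BASE_URL, root),
        "pricing": urljoin(BASE_URL, f"{root}/pricing"),
        "calculate": urljoin(BASE_URL, f"{root}/pricing/calculate?{calc_query}"),
        "versions": urljoin(BASE_URL, f"/api/v1/models?{family_query}"),
    }


if __name__ == "__main__":
    urls = model_endpoints("deepgram-base-video", "deepgram", 1_000_000, 500_000)
    for name, url in urls.items():
        print(f"{name}: GET {url}")
```

Note that the calculator path takes token counts even though this model is priced per second of audio, so its result may not be meaningful for `deepgram-base-video`.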
