Cosmos 3 Super Text2Image Agentic is NVIDIA's text-to-image generation model in the Cosmos family. It adds agentic capabilities that enable programmatic and autonomous image synthesis workflows.
Specifications
Canonical ID: nvidia-cosmos-3-super-text2image-agentic
Type: Image Generation
Status: Active
Creator: NVIDIA
Input Modalities: Text
Output Modalities: Image

Capabilities

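Input: 1/5 supported
Text: supported
Image: not supported
Audio: not supported
Video: not supported
PDF: not supported

Output: 1/5 supported
Text: not supported
Image: supported
Audio: not supported
Video: not supported
Embedding: not supported

Capabilities: 0/13 supported
Reasoning: not supported
Adaptive Reasoning: not supported
Function Calling: not supported
Parallel Function Calling: not supported
Structured Outputs: not supported
Native JSON Schema: not supported
Web Search: not supported
URL Context: not supported
Computer Use: not supported
Code Execution: not supported
File Search: not supported
Prompt Caching: not supported
Assistant Prefill: not supported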
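
The capability lists above can be held as a small lookup table for client-side checks. The following is a minimal sketch, assuming Text is the only supported input, Image is the only supported output, and none of the 13 listed capabilities is available, as the tallies 1/5, 1/5, and 0/13 indicate. The names MODEL_CAPABILITIES, supported, and summarize are hypothetical and not part of any NVIDIA API.

```python
# Hypothetical in-memory copy of the capability lists on this page.
# True means the entry is counted as supported in the page's tally.
MODEL_CAPABILITIES = {
    "input": {
        "Text": True, "Image": False, "Audio": False,
        "Video": False, "PDF": False,
    },
    "output": {
        "Text": False, "Image": True, "Audio": False,
        "Video": False, "Embedding": False,
    },
    "capabilities": {
        name: False
        for name in [
            "Reasoning", "Adaptive Reasoning", "Function Calling",
            "Parallel Function Calling", "Structured Outputs",
            "Native JSON Schema", "Web Search", "URL Context",
            "Computer Use", "Code Execution", "File Search",
            "Prompt Caching", "Assistant Prefill",
        ]
    },
}


def supported(group: str) -> list[str]:
    """Return the names in a group that are marked as supported."""
    return [name for name, ok in MODEL_CAPABILITIES[group].items() if ok]


def summarize(group: str) -> str:
    """Return the 'supported/total' tally for a group."""
    flags = MODEL_CAPABILITIES[group]
    return f"{len(supported(group))}/{len(flags)}"


if __name__ == "__main__":
    for group in MODEL_CAPABILITIES:
        print(group, summarize(group), supported(group))
```

A caller could use supported("input") to reject a request that attaches an image or audio file before it is sent.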

Versions

Version | Released | Context | Input / 1M | Output / 1M | Status
Cosmos 3 Super Text2Image Agentic | - | - | - | - | Current
Cosmos 3 Super Image2Video | - | - | - | - | Available

Other Models

Model | Tier | Released | Context | Input / 1M | Output / 1M
Cosmos | - | - | - | - | -
Cosmos Reason1 7B | - | - | - | - | -

Model IDs

cosmos3-super-text2image-agentic
nvidia-cosmos-3-super-text2image-agentic
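
Both identifiers above can be normalized to the canonical ID before a lookup. The following is a minimal sketch, assuming both listed IDs refer to the same model. The names MODEL_IDS and resolve_model_id are hypothetical and illustrate the mapping only; they are not part of any NVIDIA client library.

```python
# Canonical ID as listed under Specifications.
CANONICAL_ID = "nvidia-cosmos-3-super-text2image-agentic"

# Hypothetical alias table built from the Model IDs list on this page.
MODEL_IDS = {
    "cosmos3-super-text2image-agentic": CANONICAL_ID,
    "nvidia-cosmos-3-super-text2image-agentic": CANONICAL_ID,
}


def resolve_model_id(model_id: str) -> str:
    """Map any listed model ID to the canonical ID, or raise ValueError."""
    key = model_id.strip()
    if key not in MODEL_IDS:
        raise ValueError(f"unknown model ID: {model_id!r}")
    return MODEL_IDS[key]


if __name__ == "__main__":
    print(resolve_model_id("cosmos3-super-text2image-agentic"))
```

Raising on unknown IDs, rather than passing them through, surfaces typos early instead of at request time.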