
Qwen2 Audio 7B Instruct


Qwen2 Audio 7B Instruct is a 7-billion-parameter multimodal audio-language model from Alibaba's Qwen2 series, with a 4K-token context window and up to 4K output tokens. It can understand and respond to audio inputs alongside text.
Specifications
Canonical ID: alibaba-qwen2-audio-7b-instruct
Type: Language
Status: Active
Creator: Alibaba
Context Window: 4K tokens
Max Output: 4K tokens
Input Modalities: Audio
Output Modalities: Text
Parameters: 7B

Capabilities

Input (1/5)
Supported: Audio
Not supported: Text, Image, Video, PDF

Output (1/5)
Supported: Text
Not supported: Image, Audio, Video, Embedding

Capabilities (0/13)
Supported: none
Not supported: Reasoning, Adaptive Reasoning, Function Calling, Parallel Function Calling, Structured Outputs, Native JSON Schema, Web Search, URL Context, Computer Use, Code Execution, File Search, Prompt Caching, Assistant Prefill

Versions

Version | Released | Context | Input / 1M | Output / 1M | Status
EAGLE Qwen 2.5 3B Instruct | - | - | - | - | Available
Qwen3 Max Thinking | - | 262K | $0.780 | $3.90 | Available
Qwen3 Next 80B A3B | - | 128K | $0.150 | $1.20 | Available
Qwen3 VL 235B A22B | - | 128K | $0.530 | $2.66 | Available
Qwen3 VL 8B Thinking | - | 131K | $0.117 | $1.36 | Available
Qwen3 VL 235B A22B Instruct | - | 131K | $0.400 | $1.60 | Available
Qwen3 VL 235B A22B Thinking | - | 131K | $0.400 | $4.00 | Available
Qwen3 Coder Plus | - | 1.0M | $0.650 | $3.25 | Available
Qwen3 Max | - | 262K | $0.359 | $1.43 | Available
Qwen3 Max Preview | - | 262K | $1.20 | $6.00 | Available
Qwen2 Audio 7B Instruct | - | 4K | - | - | Current
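
The "Input / 1M" and "Output / 1M" columns can be turned into a rough per-request cost estimate. The sketch below assumes those figures are US dollars per one million tokens, which the listing implies but does not state. The names `PRICES` and `estimate_cost` are hypothetical helpers for illustration and are not part of any published API. Qwen2 Audio 7B Instruct is left out because the listing gives no price for it.

```python
from decimal import Decimal

# (input, output) prices per 1M tokens, copied from the Versions listing.
# Assumption: the amounts are USD per one million tokens.
PRICES = {
    "Qwen3 Max": (Decimal("0.359"), Decimal("1.43")),
    "Qwen3 Coder Plus": (Decimal("0.650"), Decimal("3.25")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost of one request for a listed model."""
    input_price, output_price = PRICES[model]
    million = Decimal(1_000_000)
    # Each side is billed in proportion to its token count.
    return (input_price * input_tokens + output_price * output_tokens) / million


# Example: 200,000 input tokens and 50,000 output tokens on Qwen3 Max.
cost = estimate_cost("Qwen3 Max", 200_000, 50_000)
print(f"Estimated cost: ${cost}")
```

`Decimal` is used instead of floats so that small per-token amounts do not pick up binary rounding error when summed over many requests.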

Model IDs