Alibaba's Qwen AI team introduced Qwen3-Omni, a new AI model that works with text, images, audio, and video in real time. Qwen3-Omni processes text in 119 languages, recognizes speech in 19 languages, and responds with speech in 10. The model can transcribe up to 30 minutes of audio, and its response latency is as low as 234 milliseconds. The architecture is split into two parts: “Thinker” analyzes the input and generates text, while “Talker” converts that text into speech as it is produced, so voice output starts quickly.
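The announcement does not include reference code, but the latency benefit of the split is easy to picture: the text stage streams partial output and the speech stage consumes it chunk by chunk, so playback can begin before the full answer is finished. The sketch below is a conceptual illustration only, with hypothetical thinker and talker functions standing in for the real model components.

```python
# Conceptual sketch of a Thinker/Talker split (illustrative only, not the
# actual Qwen3-Omni implementation): the "thinker" streams text chunks and
# the "talker" turns each chunk into audio as soon as it arrives.
from typing import Iterator


def thinker(prompt: str) -> Iterator[str]:
    """Hypothetical text generator that yields the reply chunk by chunk."""
    for chunk in ["Sure, ", "here is ", "the answer."]:
        yield chunk


def talker(text_chunk: str) -> bytes:
    """Hypothetical TTS stage that converts a text chunk into audio bytes."""
    return text_chunk.encode("utf-8")  # placeholder for real synthesized audio


def respond(prompt: str) -> Iterator[bytes]:
    """Pipe thinker output straight into the talker to minimize latency."""
    for chunk in thinker(prompt):
        yield talker(chunk)


for audio in respond("What is Qwen3-Omni?"):
    pass  # stream each audio chunk to the speaker here
```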
Qwen3-Omni posted leading results on 32 of 36 audio and audio-video benchmarks, outperforming Gemini 2.5 Flash and GPT-4o in speech recognition and voice generation. The model uses a mixture-of-experts architecture that activates only about three billion parameters per request, which keeps processing fast and performance stable even when it handles several data types at once.
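For readers unfamiliar with mixture-of-experts layers, the toy routing sketch below shows the general idea behind “three billion active parameters”: a gate scores every expert, but only the top-scoring few actually run per token. The sizes and router here are generic illustrations, not Qwen3-Omni's actual configuration.

```python
# Minimal sketch of mixture-of-experts routing (generic technique, not the
# exact Qwen3-Omni router): a gate scores all experts, but only the top-k
# experts are evaluated, so only a small slice of the parameters is active.
import numpy as np

num_experts, top_k, hidden = 8, 2, 16
rng = np.random.default_rng(0)
experts = [rng.standard_normal((hidden, hidden)) for _ in range(num_experts)]
gate = rng.standard_normal((hidden, num_experts))


def moe_forward(x: np.ndarray) -> np.ndarray:
    scores = x @ gate                   # gate logits for every expert
    top = np.argsort(scores)[-top_k:]   # pick the k best-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()            # softmax over the selected experts only
    # Only the chosen experts are evaluated; the rest stay idle.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))


out = moe_forward(rng.standard_normal(hidden))
print(out.shape)  # (16,)
```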
Users can customize the behavior of Qwen3-Omni through system prompts, for example to change the style or “personality” of its responses. The model integrates with external tools and services to carry out complex tasks. It is available in Qwen Chat, as a demo on Hugging Face, and developers can connect it to their applications via Alibaba's API.
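As a hedged example of how a developer might call the model with a custom “personality”, the snippet below uses the OpenAI-compatible endpoint exposed by Alibaba Cloud Model Studio; the base URL and model name are assumptions and should be checked against the current documentation.

```python
# Hedged sketch of calling Qwen3-Omni through Alibaba Cloud's OpenAI-compatible
# endpoint; the base_url and model name are assumptions based on Model Studio
# conventions and may differ in the current docs.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",  # placeholder credential
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

# A system prompt customizes the style or "personality" of the responses.
stream = client.chat.completions.create(
    model="qwen3-omni-flash",  # assumed model name; verify in the console
    messages=[
        {"role": "system", "content": "You are a cheerful radio host. Keep answers short."},
        {"role": "user", "content": "Introduce yourself in one sentence."},
    ],
    stream=True,  # omni models are typically served in streaming mode
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```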
In addition to the base version, Alibaba released a specialized model, Qwen3-Omni-30B-A3B-Captioner, for detailed audio descriptions such as music or sound effects. Qwen3-Omni-30B-A3B-Instruct for instruction following and Qwen3-Omni-30B-A3B-Thinking for complex reasoning tasks have also been made available.
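Since these checkpoints are published as open weights on Hugging Face, they can be fetched with standard huggingface_hub tooling. The repository IDs below mirror the model names above; treat the exact paths as assumptions to verify on the hub, and note that each checkpoint is tens of gigabytes.

```python
# Hedged sketch: downloading one of the open-weight checkpoints from Hugging
# Face. The repository IDs follow the model names in the announcement; the
# download call itself is standard huggingface_hub usage.
from huggingface_hub import snapshot_download

repos = [
    "Qwen/Qwen3-Omni-30B-A3B-Instruct",   # instruction following
    "Qwen/Qwen3-Omni-30B-A3B-Thinking",   # complex reasoning
    "Qwen/Qwen3-Omni-30B-A3B-Captioner",  # detailed audio captions
]

# Download the captioner weights locally (large; needs matching disk space).
local_dir = snapshot_download(repo_id=repos[-1])
print("Checkpoint stored at:", local_dir)
```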