Alibaba Model Can Clone Voice from 3 Seconds of Audio

New tools allow easy creation of personalized assistants and voice-overs, working with multiple languages and styles

Published: 24.12.2025


The Qwen team at Alibaba Cloud has introduced two new AI models for creating or cloning voices from text commands. Both models generate speech from text and can reproduce a voice close to the original after hearing just three seconds of audio.

Users enter text, and the system converts it into speech with the specified characteristics. A short audio clip is enough for voice cloning, which makes the process quick and convenient. The models support multiple languages, including English and Chinese, and handle intonation and speaking style.

According to the developers, the models can be used to create personalized voice assistants, voice-overs for videos and audiobooks, and educational and entertainment applications. The service is aimed at a broad audience, including developers and everyday users.

Alibaba Cloud plans to keep improving these tools and expanding their features, with a focus on user data security and protection. The new capabilities are already available for testing through the company’s official channels.

