Alibaba has added voice and vision features to its chatbot Qwen Chat: it can now speak and see. In voice mode, you can talk to the AI in real time. The bot can also work with your camera: it sees objects, reacts to them, and describes them. All of this is available for free, right in your browser.

A voice chat session lasts up to three minutes. You can choose from several voices, including a “Virtual Girlfriend” voice whose intonation suggests that the conversation can be more than just informative. The system responds quickly, and its replies sound natural and come without delays.
The features are available in English and Chinese. If you start speaking another language, the bot switches to Chinese. If you turn off the microphone, you can switch to text mode. But the main highlight is the combination of voice, visual analysis, and live reactions.
Along with the Qwen Chat update, the company introduced the new visual model QVQ-Max. It analyzes images and videos, can recognize faces, and provides commentary on what it sees.