Alibaba has introduced new image editing capabilities in its Qwen-Image-Edit model, which is built on the 20-billion-parameter Qwen-Image model. The system processes the input image along two paths: Qwen2.5-VL handles semantic control, while a variational autoencoder (VAE) controls the visual appearance of the image. Users can make simple adjustments as well as complex semantic changes while keeping the main object recognizable.

Qwen-Image-Edit lets users modify specific areas of a photo without affecting the rest, or change the image completely while preserving the main object. For example, new versions of the Capybara mascot can be created for use as stickers in messaging apps, and objects, people, or animals can be rotated by 90 or 180 degrees. The tool also supports style changes, such as transforming portraits into the Studio Ghibli style.
The editor can also add captions with realistic shadows, change letter colors, remove unwanted details from the image, and edit the background or clothing.
One of the main advantages of Qwen-Image-Edit is its ability to edit text in images in both Chinese and English. Users can add, delete, or change text while preserving the original font, size, and style. To do this, the user selects the relevant area, and the model then updates only the marked regions. If the result is not perfect, it can be refined step by step until the desired appearance is achieved.
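The select-then-refine workflow can be illustrated with a minimal sketch. It is not how Qwen-Image-Edit works internally and does not use its API: the image is a plain grid of grayscale values, and `edit_fn` is a hypothetical stand-in for whatever change the model would make inside the selected box. The names `edit_region` and `refine` are likewise invented for this illustration.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale pixels, row-major
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive
EditFn = Callable[[int], int]    # hypothetical stand-in for the model's edit


def edit_region(image: Image, box: Box, edit_fn: EditFn) -> Image:
    """Return a copy of `image` with `edit_fn` applied only inside `box`."""
    left, top, right, bottom = box
    result = [row[:] for row in image]  # copy so the input stays untouched
    for y in range(max(top, 0), min(bottom, len(image))):
        for x in range(max(left, 0), min(right, len(image[y]))):
            result[y][x] = edit_fn(image[y][x])
    return result


def refine(image: Image, steps: List[Tuple[Box, EditFn]]) -> Image:
    """Apply a chain of region edits, each starting from the previous result."""
    for box, edit_fn in steps:
        image = edit_region(image, box, edit_fn)
    return image


if __name__ == "__main__":
    picture = [[10] * 4 for _ in range(3)]
    # First pass: overwrite a 2x2 region; second pass: darken an overlapping one.
    final = refine(picture, [
        ((1, 0, 3, 2), lambda p: 255),
        ((2, 1, 4, 3), lambda p: p // 2),
    ])
    for row in final:
        print(row)
```

Pixels outside each selected box are copied through unchanged, which mirrors the idea that only the marked areas are updated, and chaining the steps mirrors gradual refinement.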
Qwen-Image-Edit is now available through the “Image Editing” feature in Qwen Chat, as well as on GitHub, Hugging Face, and ModelScope. Alibaba claims the model achieves leading results on public image editing benchmarks but has not released exact figures.