Alibaba introduced the Qwen VLo model for step-by-step image generation

The new multimodal model can change backgrounds, combine multiple images, and work with text in two languages

Published: 06.07.2025

Image generated in Qwen VLo

Alibaba has introduced Qwen VLo, a multimodal AI model that analyzes, creates, and edits images based on text prompts. Qwen VLo generates images progressively, from left to right and top to bottom, which gives users more control over the result and is especially useful for long text descriptions.
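The left-to-right, top-to-bottom order described above can be illustrated with a toy sketch. This is not Qwen VLo's implementation, which the article does not describe; the function names `raster_order` and `generate_progressively` and the idea of a fixed patch grid are assumptions made purely for illustration.

```python
from typing import Iterator, List, Tuple


def raster_order(rows: int, cols: int) -> Iterator[Tuple[int, int]]:
    """Yield (row, col) positions top to bottom, left to right within each row."""
    for r in range(rows):
        for c in range(cols):
            yield (r, c)


def generate_progressively(rows: int, cols: int) -> List[List[int]]:
    """Fill a grid one cell at a time in raster order.

    Each cell stores the step at which it was filled; a real model would
    store generated image content instead (hypothetical placeholder).
    """
    canvas = [[-1] * cols for _ in range(rows)]
    for step, (r, c) in enumerate(raster_order(rows, cols)):
        canvas[r][c] = step
    return canvas


if __name__ == "__main__":
    for row in generate_progressively(2, 3):
        print(row)
```

Because every earlier cell is already filled when a new one is produced, a partial result can be inspected at any step, which is one way to read the article's point about control over the output.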

The model understands complex natural language instructions. Users can change the background, add new objects, change the image style, and combine multiple images into one.

Qwen VLo supports both artistic and technical edits. It can create segmentation maps, perform contour detection, and produce depth maps with color overlays. The model also recognizes parts of an image and estimates scene depth.

The system is designed to work with various image resolutions and aspect ratios, including extreme formats such as 4:1 or 1:3, although this capability has not yet been enabled. It processes requests in both Chinese and English.
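To get a sense of what aspect ratios such as 4:1 or 1:3 mean in pixels, the arithmetic can be sketched as follows. The function `dimensions_for_ratio` and the one-megapixel budget are assumptions for illustration only; the article does not state which resolutions Qwen VLo actually uses.

```python
import math
from typing import Tuple


def dimensions_for_ratio(ratio_w: int, ratio_h: int, pixel_budget: int) -> Tuple[int, int]:
    """Return (width, height) close to ratio_w:ratio_h with about pixel_budget pixels."""
    # Solve (ratio_w * s) * (ratio_h * s) = pixel_budget for the scale s.
    scale = math.sqrt(pixel_budget / (ratio_w * ratio_h))
    width = max(1, round(ratio_w * scale))
    height = max(1, round(ratio_h * scale))
    return width, height


if __name__ == "__main__":
    budget = 1024 * 1024  # hypothetical pixel budget, not a documented value
    print(dimensions_for_ratio(4, 1, budget))
    print(dimensions_for_ratio(1, 3, budget))
```

With this budget, a 4:1 image comes out four times wider than it is tall, while a 1:3 image is roughly three times taller than it is wide.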

Qwen VLo is currently available to try in Qwen Chat. The company acknowledges that the model can produce generation errors, inconsistencies with the source image, and difficulties following detailed instructions, and says it plans to improve stability and reliability.
