Google has introduced a new generative AI tool called Whisk, which lets users create images from visual cues instead of text descriptions. Users upload images that define the subject, scene, and style, and Whisk combines these elements to generate a new image.
Whisk is powered by the Imagen 3 image generation model, which allows for fast image creation. Google emphasizes that Whisk is designed for “quick visual exploration, not precise pixel editing.” If the results do not meet expectations, users can modify the text prompts that Whisk derives from the uploaded images.
Google notes in a blog post that Whisk may not always reproduce the desired features accurately, which is why users can review and edit the prompts to improve results. The tool is currently available only in the US through Google Labs, where the company is collecting user feedback to improve its generative AI technologies.