Google DeepMind has introduced SIMA 2, a new generation of its generalist AI agent that builds on the language and reasoning capabilities of the Gemini model. SIMA 2 learns not only to execute commands but also to understand and interact with its environment in virtual worlds. Unlike the first version, which could only perform simple tasks in video games, SIMA 2 roughly doubles its predecessor's performance on complex tasks and improves its skills independently from its own experience.
SIMA 2 is built on Gemini 2.5 Flash-Lite and can carry out tasks in gaming environments it has never encountered before. In a demonstration, the agent described the world around it, decided on its next actions, and located objects to interact with. SIMA 2 can also interpret abstract hints, such as emojis, and explain its decisions as it performs tasks.
Among the new features is self-learning without large amounts of human data. SIMA 2 starts from initial knowledge derived from human gameplay data, then generates new tasks for itself and evaluates the results using additional Gemini models. This lets the agent learn from its own mistakes and gradually improve its behavior in virtual worlds.
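The self-improvement cycle described here can be illustrated with a toy sketch: an agent proposes tasks for itself, attempts them, and a separate "judge" model scores each attempt, with successful experience feeding back into the agent's competence. All names and mechanics below are illustrative assumptions, not SIMA 2's actual architecture or API.

```python
import random

def propose_task(rng):
    """Stand-in for the agent generating a new task for itself."""
    return {"goal": f"collect-item-{rng.randint(1, 5)}"}

def attempt_task(task, skill, rng):
    """Stand-in for acting in the environment; success probability
    grows with the agent's current skill level."""
    return rng.random() < skill

def judge(success):
    """Stand-in for an external Gemini-style evaluator scoring the attempt."""
    return 1.0 if success else 0.0

def self_improvement_loop(steps=200, seed=0):
    rng = random.Random(seed)
    skill = 0.2          # initial competence from human gameplay data
    experience = []      # buffer of (task, score) pairs for "training"
    for _ in range(steps):
        task = propose_task(rng)
        score = judge(attempt_task(task, skill, rng))
        experience.append((task, score))
        # crude "training" step: each successful episode nudges skill upward
        skill = min(0.95, skill + 0.01 * score)
    return skill, experience

final_skill, experience = self_improvement_loop()
print(f"skill after self-play: {final_skill:.2f}")
```

The key property of the loop, mirrored in this sketch, is that no new human labels are needed: the task generator and the judge together provide the learning signal.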
DeepMind sees SIMA 2 as a step toward general-purpose robots and systems capable of performing varied tasks in the real world. Although the company has not announced a timeline for applying SIMA 2 to physical robotics, the model is currently available to researchers and developers as a limited preview.

