The Japanese lab Sakana AI has introduced a new technique that lets several large language models work together on a single task. Its Multi-LLM AB-MCTS system combines the capabilities of different models to solve complex problems that are beyond the reach of any single one. Each model takes part in a trial-and-error search, and the models collectively converge on the best solution.
The system is built on the Adaptive Branching Monte Carlo Tree Search (AB-MCTS) algorithm, which at each step decides whether to refine an answer it has already produced (search deeper) or to generate a new one (search wider). The algorithm not only chooses the search strategy but also determines which model is best suited to a given stage of the task. Initially all models are used equally, but over time the algorithm favors those that show better results.
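To make that loop concrete, here is a minimal Python sketch of the idea. Everything in it is an assumption for illustration, not Sakana AI's actual implementation: the "models" are random mocks standing in for LLM calls, the score is a placeholder for a real evaluator, a fixed widen probability replaces AB-MCTS's adaptive branching decision, and Thompson sampling over Beta posteriors is one plausible way to realize "favor the models that show better results".

```python
import random

def mock_model(skill):
    """Build a fake LLM: proposes a new answer or refines a draft (mock only)."""
    def call(task, draft=None):
        base = draft["score"] if draft else 0.0
        score = min(1.0, base + random.uniform(0.0, skill))
        return {"text": f"candidate for {task!r}", "score": score}
    return call

MODELS = {  # stand-ins for o4-mini, Gemini 2.5 Pro, DeepSeek-R1, etc.
    "model_a": mock_model(0.5),
    "model_b": mock_model(0.3),
    "model_c": mock_model(0.2),
}

def thompson_pick(stats):
    """Sample each model's Beta(wins+1, losses+1) posterior; take the best draw."""
    return max(stats, key=lambda m: random.betavariate(stats[m][0] + 1, stats[m][1] + 1))

def multi_llm_ab_mcts(task, budget=30, widen_prob=0.5):
    stats = {name: [0, 0] for name in MODELS}  # per-model [wins, losses]
    candidates = []                            # flat stand-in for the search tree
    for _ in range(budget):
        name = thompson_pick(stats)
        # Adaptive branching: generate a fresh answer ("go wider") or
        # hand the current best answer to a model for refinement ("go deeper").
        if not candidates or random.random() < widen_prob:
            node = MODELS[name](task)
        else:
            best = max(candidates, key=lambda c: c["score"])
            node = MODELS[name](task, draft=best)
        prev_best = max((c["score"] for c in candidates), default=0.0)
        stats[name][0 if node["score"] > prev_best else 1] += 1  # update posterior
        candidates.append(node)
    return max(candidates, key=lambda c: c["score"])

print(multi_llm_ab_mcts("ARC-style puzzle"))
```

In the full method, the widen-versus-deepen choice is itself made adaptively from node scores rather than by a fixed coin flip, and candidates live in a tree so any promising branch can be extended, not just the current global best; the sketch only shows the shape of the loop.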
The researchers evaluated Multi-LLM AB-MCTS on ARC-AGI-2, a benchmark that assesses an AI's ability to solve difficult visual reasoning problems. A collective of models including o4-mini, Gemini 2.5 Pro, and DeepSeek-R1 correctly solved over 30% of the 120 tasks, significantly surpassing the results of any individual model. In some cases one model's solution was incorrect, but the system passed it to the other models, which found and corrected the errors.
Experts note that this approach reduces the number of erroneous responses and makes it possible to combine the strengths of different models, which can be valuable for businesses where the accuracy and reliability of AI systems matter. For developers, Sakana AI has released the algorithm as the open-source framework TreeQuest, which can be used in their own projects, including for commercial purposes.