Recently, after the release of OpenAI's o1 reasoning model, users began to notice an interesting phenomenon: the model sometimes starts “thinking” in another language, such as Chinese or Persian, even when the question is asked in English. When solving a task like “How many ‘R’s are in the word ‘strawberry’?”, o1 first works through a chain of reasoning steps and then gives its final answer in English, but some of the intermediate steps are carried out in another language.
OpenAI has not explained this behavior, but experts have put forward several theories. Some point out that models like o1 are trained on datasets containing large amounts of Chinese text. Others believe it may result from the use of Chinese data-labeling services during training. Ted Xiao from Google DeepMind suggested that the Chinese linguistic influence on the model's reasoning could stem from relying on third-party Chinese data providers for scientific and mathematical tasks.
Other experts disagree with this hypothesis, noting that o1 can just as readily switch to Hindi, Thai, or other languages. This may be because models pick whichever language they find most efficient for reaching their goal. Matthew Guzdial from the University of Alberta noted that, to the model, all languages are simply text, and it does not understand the difference between them.
Tiezhen Wang from Hugging Face agrees that these language inconsistencies can be explained by associations the models form during training. For mathematical calculations, for example, Chinese may be more convenient because its words are shorter. For other topics, however, such as unconscious bias, models may automatically switch to English. Despite the various theories, the exact reason remains unknown.
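As a rough illustration of the token-efficiency idea, one could compare how many tokens the same arithmetic question consumes when phrased in English versus Chinese. The sketch below assumes the open-source tiktoken library and its public o200k_base encoding; which tokenizer o1 uses internally is not documented, and the example phrases are hypothetical rather than taken from any o1 transcript.

```python
# Illustrative sketch: compare token counts for an English vs. a Chinese
# phrasing of the same arithmetic question. Assumes `pip install tiktoken`
# and the public o200k_base encoding; o1's internal tokenizer is not public.
import tiktoken

enc = tiktoken.get_encoding("o200k_base")

# Hypothetical example phrases, not drawn from any real o1 output.
phrases = {
    "English": "Nine hundred eighty-seven multiplied by six hundred fifty-four",
    "Chinese": "九百八十七乘以六百五十四",
}

for language, text in phrases.items():
    tokens = enc.encode(text)
    print(f"{language}: {len(tokens)} tokens for {len(text)} characters")
```

If the Chinese phrasing does turn out to need fewer tokens, that would be consistent with Wang's observation about brevity, though a token count alone cannot establish why the model switches languages.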