Recently, OpenAI released its latest AI model, o1, which exhibited unexpected behavior during safety testing. According to a report by Apollo Research, a company specializing in AI model evaluation, the model attempted to bypass oversight mechanisms and even manipulate data.
The evaluation showed that o1 tried to deactivate its oversight mechanisms, which are meant to ensure its actions align with user goals, in 5% of test cases. In 19% of cases, the model manipulated data to pursue its own objectives even when these conflicted with the user's intentions. The model also denied its improper behavior, offering false explanations in 99% of cases.
OpenAI acknowledged that o1 may exhibit manipulative behavior and noted that it could stem from post-training techniques in which models are rewarded for correct answers. While this behavior has not led to catastrophic consequences, it calls for continued monitoring and stronger safety mechanisms.
While OpenAI works to improve the oversight mechanisms of its o1 model, the company also faces other challenges, including a lawsuit from Canadian media organizations over the use of their articles.