
A new AI test revealed unexpected model features

Researchers used radio show puzzles that require no specialized knowledge to evaluate the intuitive capabilities of artificial intelligence

Alex Dubenko
Published: 06.02.2025

Researchers from several US universities and the startup Cursor have developed a new test to assess the capabilities of generative AI models, built from puzzles featured on the NPR radio show “Sunday Puzzle.” The test revealed unexpected model behavior: some models, including those from OpenAI, sometimes “give up” and provide incorrect answers.

The puzzles can be understood without specialized knowledge, which makes the test accessible to a wide audience. They do not require specific expertise, and they are phrased so that models cannot rely on rote memory. This makes the test appealing to researchers who want to understand how AI models solve tasks that require intuition and a process of elimination.

The best result on the test so far belongs to the o1 model, which scored 59%, while the new o3-mini model, set to high reasoning effort, scored 47%. The researchers plan to extend testing to other models to determine how their performance can be improved and which aspects of model operation need work.

The “Sunday Puzzle” test has its limitations: it is aimed at an English-speaking audience. Even so, the researchers believe that regularly updating the questions will keep the test relevant and let them track how model performance changes over time.

Tagged: OpenAI, Testing
Sources: techcrunch.com