Apple Research Questions AI Reasoning Mo…

06.09.2025

A newly published Apple Machine Learning Research study has challenged the prevailing narrative around AI "reasoning" large language models like OpenAI's o1 and Claude's thinking variants, revealing fundamental limitations that suggest

For the study, rather than using standard math benchmarks that are prone to data contamination, Apple researchers designed controllable puzzle environments including Tower of Hanoi and River Crossing. This allowed a precise analysis of both the final answers and the internal reasoning traces across varying complexity levels, according to the researchers.
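To illustrate what a "controllable puzzle environment" can look like, here is a minimal sketch of a Tower of Hanoi move checker. It is not the researchers' code: the function name `validate_hanoi`, the peg numbering (0, 1, 2), and the `(source, destination)` move format are all assumptions made for this example. The point is that the difficulty is set by a single parameter (the number of disks) and every intermediate step can be checked, not just the final answer.

```python
def validate_hanoi(n_disks, moves):
    """Return True if `moves` legally transfers all disks from peg 0 to peg 2.

    Hypothetical helper for illustration: `moves` is a list of
    (source_peg, destination_peg) pairs with pegs numbered 0, 1, 2.
    """
    # Peg 0 starts with every disk, largest (n_disks) at the bottom.
    pegs = [list(range(n_disks, 0, -1)), [], []]
    for src, dst in moves:
        # Reject unknown pegs, no-op moves, and moves from an empty peg.
        if src not in (0, 1, 2) or dst not in (0, 1, 2) or src == dst:
            return False
        if not pegs[src]:
            return False
        disk = pegs[src][-1]
        # A larger disk may never be placed on a smaller one.
        if pegs[dst] and pegs[dst][-1] < disk:
            return False
        pegs[dst].append(pegs[src].pop())
    # Solved only if the full stack ends up on peg 2 in the right order.
    return pegs[2] == list(range(n_disks, 0, -1))
```

A checker of this kind can score a model's full move list, so a failure can be traced to the exact move where a rule was broken.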

The results are striking, to say the least. All tested reasoning models – including o3-mini, DeepSeek-R1, and Claude 3.7 Sonnet – experienced complete accuracy collapse beyond certain complexity thresholds, dropping to zero success rates despite having adequate computational resources. Counterintuitively, the models actually reduced their thinking effort as problems became more complex, suggesting fundamental scaling limitations rather than resource constraints.

Perhaps most damning, even when researchers provided complete solution algorithms, the models still failed at the same complexity points. Researchers say this indicates the limitation isn't in problem-solving strategy, but in basic logical step execution.
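For context on what a "complete solution algorithm" means for a puzzle like Tower of Hanoi, the textbook recursive solution fits in a few lines. The sketch below is that standard recursion, not the exact algorithm or prompt text the researchers gave the models; the function name `solve_hanoi` and the peg numbering are assumptions for this example. Executing it requires only following the same three steps repeatedly, although the number of moves grows as 2^n - 1 with the number of disks n.

```python
def solve_hanoi(n, src=0, dst=2, aux=1):
    """Return the list of (source, destination) moves for n disks.

    Standard textbook recursion (illustrative, not taken from the study):
    move n-1 disks out of the way, move the largest disk, then move the
    n-1 disks back on top of it.
    """
    if n == 0:
        return []
    return (
        solve_hanoi(n - 1, src, aux, dst)    # clear the n-1 smaller disks
        + [(src, dst)]                       # move the largest disk
        + solve_hanoi(n - 1, aux, dst, src)  # restack the smaller disks
    )


if __name__ == "__main__":
    for disks in (3, 7, 10):
        # The move count doubles (plus one) with each extra disk: 2**n - 1.
        print(disks, len(solve_hanoi(disks)))
```

The exponential growth in move count is why adding a few disks turns a short puzzle into one with hundreds or thousands of steps that must each be carried out correctly.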

Models also showed puzzling inconsistencies – succeeding on problems requiring 100+ moves while failing on simpler puzzles needing only 11 moves.

The research highlights three distinct performance regimes: standard models surprisingly outperform reasoning models at low complexity, reasoning models show advantages at medium complexity, and both approaches fail completely at high complexity. The researchers' analysis of reasoning traces showed inefficient "overthinking" patterns, where models found correct solutions early but wasted computational budget exploring incorrect alternatives.

The takeaway from Apple's findings is that current "reasoning" models rely on sophisticated pattern matching rather than genuine reasoning capabilities. The study suggests that LLMs don't scale reasoning the way humans do: they overthink easy problems and think less on harder ones.

The timing of the publication is notable, having emerged just days before WWDC 2025, where Apple is expected to limit its focus on AI in favor of new software designs and features, according to Bloomberg.

Tag: Apple Research

This article, "Apple Research Questions AI Reasoning Models Just Days Before WWDC" first appeared on MacRumors.com

