Researchers tested Large Reasoning Models on various puzzles. As the puzzles grew more difficult, the models failed more often, until at a certain complexity threshold they all collapsed to complete failure.

Even without the ability to reason, current AI will still be revolutionary. It can get us to Level 4 self-driving, and it can outperform doctors and many other professionals at their own work. It should make humanoid robots capable of much physical work.

Still, this research suggests the current approach to AI will not lead to AGI, no matter how much training and scaling you throw at it. That’s a problem for the people betting hundreds of billions of dollars on this approach, hoping it will pay off with a new AGI Tech Unicorn to rival Google or Meta in revenue.

Apple study finds “a fundamental scaling limitation” in reasoning models’ thinking abilities

  • Rin@lemm.ee · 9 days ago
    They’re gonna be weak to any puzzle where the solution is a thousand words long.

    I did a test. I made my own puzzle in the form of a chessboard: black pieces meant 0s and white pieces meant 1s, and an ASCII string was encoded on the board, right to left, top to bottom. No AI I have tried (even o3 & o1-pro at max reasoning) could solve this puzzle without huge amounts of hand-holding. A human could figure it out within 30 minutes, I’d say.
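
    To make the puzzle concrete, here is a minimal decoding sketch. Only the colour-to-bit mapping and the right-to-left, top-to-bottom reading order come from the comment above; the board representation ('B'/'W'/'.' strings), the `decode_board` helper, and the example row are my own assumptions for illustration:

    ```python
    # Decode a chessboard-bitstring puzzle of the kind described above.
    # Assumed representation: each row is a string where 'B' = black
    # piece = bit 0, 'W' = white piece = bit 1, '.' = empty square.

    def decode_board(rows: list[str]) -> str:
        bits = []
        for row in rows:                  # rows top to bottom
            for square in reversed(row):  # squares right to left
                if square == "B":
                    bits.append("0")
                elif square == "W":
                    bits.append("1")      # empty squares are skipped
        # Group the bit stream into 8-bit ASCII characters.
        return "".join(
            chr(int("".join(bits[i:i + 8]), 2))
            for i in range(0, len(bits) - len(bits) % 8, 8)
        )

    # One row encoding 'H' (0b01001000); the rightmost square
    # holds the first (most significant) bit.
    print(decode_board(["BBBWBBWB"]))  # -> H
    ```

    The decoding itself is a few lines of mechanical work, which is exactly the point: the hard part for an LLM is inferring the scheme, not executing it.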

    “AGI will never come from LLMs, specifically” is a dead easy claim to believe. Please avoid making it sound like “neural networks are altogether hosed.”

    Of course, but a lot of people (ahem, Fuck AI, ahem) don’t seem to understand this. They’ll just circle-jerk themselves until their dicks fall off. They read this as “computers will never think.” Also, I’ve seen statistical models do crazy things for the benefit of humanity, for example reconstructing a human heart from MRI images and compiling reports that would otherwise take doctors hours, and doing it more accurately than a doctor would. But again, that’s because that model was not text-based.