Evidence is growing that LLMs will never be the route to AGI. They are consuming exponentially increasing energy, to deliver only linear improvements in performance.

Lugh · 1 year ago

Added to this finding, there’s a perhaps greater reason to think LLMs will never deliver AGI. They lack independent reasoning. Some supporters of LLMs said reasoning might arrive via “emergent behavior”. It hasn’t.

People are looking to get to AGI in other ways. A startup called Symbolica says a whole new approach to AI called Category Theory might be what leads to AGI. Another is “objective-driven AI”, which is built to fulfill specific goals set by humans in 3D space. By the time they are 4 years old, a child has processed 50 times more training data than the largest LLM by existing and learning in the 3D world.

conciselyverbose@sh.itjust.works · 1 year ago

They can quite possibly be a useful component. They’re the language center of the brain.

People who ever thought they would actually resemble intelligence were woefully uninformed of how complex intelligence is.

CanadaPlus@lemmy.sdf.org · edit-2 1 year ago

How complex is intelligence, though? People who were sure LLMs don’t resemble it were drawing on information we don’t actually have.

FaceDeer@kbin.social · 1 year ago

Yeah, so many people are confidently stating “LLMs can’t think like humans do!” when we’re actually still pretty unclear on how humans think.

Sure, an LLM on its own may not be an AGI. But they’re remarkably closer than we would have predicted they could get just a few years ago, and it may well be that we just need to add a bit more “special sauce” (memory, prompting strategies, perhaps a couple of parallel LLMs that specialize in different types of reasoning) to get them over the hump. At this point a lot of the research isn’t going into simply “make it bigger!”, it’s going into “use LLMs smarter.”

conciselyverbose@sh.itjust.works · edit-2 1 year ago

Obscenely.

The brain is stacks on stacks of insanely complicated systems. The fact that we know a ridiculous amount about the brain and are barely scratching the surface is exactly the point.

CanadaPlus@lemmy.sdf.org · edit-2 1 year ago

By that measure, we know everything about GPT-2, but again are just scratching the surface of how it works. I don’t think you can draw the conclusion that LLMs can never be intelligent just from that.

conciselyverbose@sh.itjust.works · 1 year ago

We “know everything about it” because it’s not that complicated.

You don’t need to process every individual step a search algorithm takes to understand how it works. LLMs are the same thing. They’re just a big box of weighted probabilities. Complexity is more than just having a really big model.

We have bits and pieces of a lot of parts, but are nowhere near a complete understanding of any of them. We kind of know how neurotransmitters work, we kind of know how hormones work and interact with those neurotransmitters, we mostly know how individual neurons fire, we kind of know what different parts of the brain do, we kind of know how the brain adapts to physical damage…

We don’t know any of the algorithms it follows. What we do know is that it’s a hell of a lot of interconnected parts, and they’re all following very different rules.

CanadaPlus@lemmy.sdf.org · 1 year ago

It’s not a search algorithm. If it is, that’s an overfitted model, and it’s detected and rejected. What a good foundation model is doing is just about as mysterious as the brain.

conciselyverbose@sh.itjust.works · 1 year ago

It’s fundamentally extremely comparable mathematically and algorithmically. That’s the point. Simulated annealing doesn’t need to understand the search space to find a pretty good answer to a problem. It just needs to know what a good answer approximately looks like and nudge potential answers closer that way.
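To make the "nudge potential answers closer" idea concrete, here is a minimal sketch of simulated annealing on a toy problem. Everything in it is an assumption for illustration: the `cost` function, the step size, and the cooling schedule are made up, and none of it comes from an actual LLM training setup.

```python
import math
import random


def cost(x):
    # Hypothetical objective: a bumpy 1-D curve whose lowest point is near x ≈ 2.2.
    return (x - 2.0) ** 2 + math.sin(5.0 * x)


def simulated_annealing(start, steps=20000, temp=5.0, cooling=0.9995, seed=0):
    """Search for a low-cost x without any knowledge of the curve's shape."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    for _ in range(steps):
        # Nudge the current answer by a small random amount.
        candidate = current + rng.gauss(0.0, 0.5)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse answers with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temp *= cooling
    return best, best_cost


best_x, best_cost = simulated_annealing(start=-3.0)
print(f"best x = {best_x:.3f}, cost = {best_cost:.3f}")
```

The loop only ever compares costs. It has no model of the search space, which is the property being claimed above.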

What LLMs are doing is not mysterious at all. Why a specific point in a model is what it is may be a mystery, but there’s no mystery to the algorithm. We can’t even guess at most of the algorithms that make up the brain.

CanadaPlus@lemmy.sdf.org · 1 year ago

Simulated annealing is a search algorithm which finds a solution.

Backpropagation is a search algorithm which finds a function, which in a big enough network could be literally any of them that are computable. Once the network is trained and rolls out for consumers, backpropagation isn’t used at all.

Those are two fundamentally different things. GPT-2 is trained, and is no longer a search algorithm by any useful definition. There are examples of small neural nets we can understand, and they’re not doing search algorithms; Quanta did a story about some just last week. If you can do simulated annealing you should probably just look into NN algorithms in detail yourself, because then you can know how that’s wrong without the internet’s help.
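A rough sketch of that split between training and use, under heavy simplifying assumptions: a single linear neuron instead of a real network, with gradients derived by hand (the one-neuron special case of what backpropagation computes). The names `train` and `predict` and the toy data are hypothetical.

```python
def train(data, epochs=2000, lr=0.05):
    """Search phase: gradient descent adjusts w and b to fit the data."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            # Gradients of mean squared error with respect to w and b.
            grad_w += 2 * err * x / n
            grad_b += 2 * err / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(params, x):
    """Use phase: only evaluates the learned function, no gradients or search."""
    w, b = params
    return w * x + b


# Toy data drawn from y = 2x + 1 (an assumption for the example).
data = [(x, 2.0 * x + 1.0) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
params = train(data)
print(predict(params, 10.0))
```

Once `train` returns, `predict` never calls it again. What ships is the function that was found, and the search that found it is gone.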

CubitOom@infosec.pub · 1 year ago

I wonder where the line is drawn between an emergent behavior and a hallucination.

If someone expects factual information and gets a hallucination, they will think the LLM is dumb or not helpful.

But if someone is encouraging hallucinations and wants fiction, they might think it’s an emergent behavior.

In humans, what is the difference between an original thought, and a hallucination?

Umbrias@beehaw.org · edit-2 1 year ago

Hallucinations are unlike human creative output. For one, AI hallucinations are unintentional. There are plenty of reasons, if you actually think about the question, why they are not the same. They are at best dreamlike, but dreams are an intentional process.

CubitOom@infosec.pub · 1 year ago

Sure, there is intentional creative thought. But there are also unintentional creative thoughts. Moments of clarity, eureka moments, and strokes of inspiration. How do we differentiate these?

If we were to say that it is because our subconscious is intentionally promoting these thoughts, then we would need a method to test that, because otherwise the difference is moot.

Similar to how one might define the I in AGI, it’s hard to form a consensus on general and often vague definitions like these.

Umbrias@beehaw.org · 1 year ago

You are assigning far more vague grandeur to AI hallucinations than they have in practice.

CubitOom@infosec.pub · 1 year ago

Maybe it’s this arbitrary word, hallucination, which was recently borrowed from the human experience to explain why something that is normally factual, like a computer, is not computing facts.

But if one were to think about it, what is the difference between a series of non-factual hallucinations in a model and a person’s individual experience of the world?

If two people eat the same food item, they might taste different things.
They might have different definitions of the same word.
They might remember that an object was a different color than someone’s recording could prove. There is a reason why eyewitness testimony is considered unreliable in a court of law.

Before, we called these bugs or even issues. But now that it’s in this black box of sorts, where we can’t alter the decision-making process as directly as before, there is this more human-sounding name all of a sudden.

To clarify, when an LLM gets a fact wrong because it has limited context or because its foundational model is flawed, is that the same result as the experience someone has after consuming psychedelic mushrooms? No, I wouldn’t say so. Nor is it the same when a team of scientists try to make a model actively hallucinate so they can find new chemical compounds.

Defining words can sometimes be very tricky, especially when they apply to multiple areas of study. The more you drill into a definition, the more it becomes a metaphysical debate. But it is important to have these discussions, because even the definition of something like AGI keeps changing. In fact, the term only exists because the goalposts for AI moved so much. What will stop a company that is trying to attract investors from just slapping an AGI label on their next release? And how will we differentiate what the spirit of the word is trying to convey from the sales pitch?

Umbrias@beehaw.org · 1 year ago

Hallucinations are not qualia.

Please go talk to an LLM for hallucinations (you can use DuckDuckGo’s implementation of ChatGPT) and see why the word is being used to mean a fairly different thing from human hallucinations.
