GPT-4 Understands

narwhal@lemmy.ml · 1 year ago

Veraticus@lib.lgbt · edit-2 1 year ago

[GPT-4] is fed, like, a line of text from some source, but with the last word missing. It guesses what the last word might be, and then it gets told whether or not it got it right so it can adjust its internal math.

GPT-4 cannot alter its weights once it has been trained so this is just factually wrong.

“It had to build, in its internal wirings and all its software neurons, some understanding of what an egg is - In other words, to get the next word right, it had to become intelligent. It’s quite a thought. It started with nothing. We jammed huge oceans of text through it, and it just wired itself into intelligence, just by being trained to do this one stupid thing.”

LLMs are really cool and very useful, don’t get me wrong. But people get excited by what they seem to do and lose sight of what they actually can do. They are not intelligent. They create text based on inputs. That is not what intelligence is, unless you have an extremely dismal view of intelligence that humans are text creation machines with no thoughts, no feelings, no desires, no ability to plan… basically, no internal world at all.

An LLM is an algorithm, not an intelligence.

SirGolan@lemmy.sdf.org · 1 year ago

GPT-4 cannot alter its weights once it has been trained so this is just factually wrong.

The bit you quoted is referring to training.
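To make the training/inference distinction concrete, here is a minimal sketch of "guess the next word, get told the answer, adjust the internal math". It is a toy lookup-table model, not how GPT-4 is built; the vocabulary, sentence, learning rate, and function names are all made up for illustration.

```python
import math

# Toy next-word predictor: one row of scores (logits) per previous word.
# The vocabulary and the training sentence are invented for illustration.
VOCAB = ["the", "hen", "laid", "an", "egg"]
INDEX = {word: i for i, word in enumerate(VOCAB)}

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, prev_word):
    """Inference: read the weights, never write them."""
    return softmax(weights[INDEX[prev_word]])

def train_step(weights, prev_word, next_word, lr=0.5):
    """Training: nudge the weights toward the word that actually came next."""
    row = weights[INDEX[prev_word]]
    probs = softmax(row)
    for i in range(len(row)):
        target = 1.0 if i == INDEX[next_word] else 0.0
        # Gradient of the cross-entropy loss w.r.t. logit i is (prob - target).
        row[i] -= lr * (probs[i] - target)

weights = [[0.0] * len(VOCAB) for _ in VOCAB]
sentence = ["the", "hen", "laid", "an", "egg"]

before = predict(weights, "an")[INDEX["egg"]]
for _ in range(20):
    for prev_word, next_word in zip(sentence, sentence[1:]):
        train_step(weights, prev_word, next_word)
after = predict(weights, "an")[INDEX["egg"]]

snapshot = [row[:] for row in weights]
predict(weights, "an")        # generating a prediction...
frozen = snapshot == weights  # ...leaves the weights untouched
```

The probability of "egg" after "an" rises only inside `train_step`; calling `predict` afterwards changes nothing, which is the sense in which a trained model's weights are fixed.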

They are not intelligent. They create text based on inputs. That is not what intelligence is, unless you have an extremely dismal view of intelligence that humans are text creation machines with no thoughts, no feelings, no desires, no ability to plan… basically, no internal world at all.

Recent papers say otherwise.

The conclusion the author of that article comes to (LLMs can understand animal language) is… problematic at the very least. I don’t know how they expect that to happen.

Veraticus@lib.lgbt · 1 year ago

In what sense does your link say otherwise? Is a world model the same thing as intelligence?

SirGolan@lemmy.sdf.org · edit-2 1 year ago

At the end of the bit I quoted, you say: “basically, no internal world at all.” But also, can you define what intelligence is? Are you sure it isn’t whatever LLMs are doing under the hood, deep in hidden layers? I guess having a world model is more akin to understanding than intelligence, but I don’t think we have a great definition of either.

Edit to add: More… papers…

Veraticus@lib.lgbt · 1 year ago

But also, can you define what intelligence is?

From the Encyclopedia Britannica:

Human intelligence is a mental quality that consists of the abilities to learn from experience, adapt to new situations, understand and handle abstract concepts, and use knowledge to manipulate one’s environment.

In no sense do LLMs do any of these except, perhaps, “understand and handle abstract concepts.” But since they themselves have no understanding of the concepts, and merely generate text that can simulate understanding, I would call that a stretch.

Are you sure it isn’t whatever LLMs are doing under the hood, deep in hidden layers?

Yes. LLMs are not magic, they are math, and we understand how they work. Deep under the hood, they are manipulating mathematical vectors that in no way are connected representationally to words. In the end, the result of that math is reapplied to a linguistic model and the result is speech. It is an algorithm, not an intelligence.

I’m not really interested in papers that either don’t understand LLMs or play word games with intelligence (shockingly, solipsism is an easy point of view to believe if you just ignore all evidence). For every one of these, you can find a dozen that correctly describe ChatGPT and its limitations. Again, including ChatGPT itself. Why not believe those instead of cherry-picking articles that gratify your ego?

SirGolan@lemmy.sdf.org · edit-2 1 year ago

I’m not really interested in papers that either don’t understand LLMs or play word games with intelligence

I mean, my first paper was from Max Tegmark. My second paper was from Microsoft. You are discounting a well known expert in the field and one of the leading companies working on AI as not understanding LLMs.

Human intelligence is a mental quality that consists of the abilities to learn from experience, adapt to new situations, understand and handle abstract concepts, and use knowledge to manipulate one’s environment.

I note that’s the definition for “human intelligence.” But either way, sure, LLMs alone can’t learn from experience (after training and between multiple separate contexts), and they can’t manipulate their environment. BabyAGI, AgentGPT, and similar things can certainly manipulate their environment using LLMs and learn from experience. LLMs by themselves can totally adapt to new situations. The paper from Microsoft discusses that. However, for sure, they don’t learn the way people do, and we aren’t currently able to modify their weights after they’ve been trained (well, not without a lot of hardware). They can certainly do in-context learning.

Yes. LLMs are not magic, they are math, and we understand how they work. Deep under the hood, they are manipulating mathematical vectors that in no way are connected representationally to words. In the end, the result of that math is reapplied to a linguistic model and the result is speech. It is an algorithm, not an intelligence.

We understand how they work? From the Wikipedia page on LLMs:

Large language models by themselves are “black boxes”, and it is not clear how they can perform linguistic tasks. There are several methods for understanding how LLM work.

It goes on to mention a couple things people are trying to do, but only with small LLMs so far.

Here’s a quote from Anthropic, another leader in AI:

We understand the math of the trained network exactly – each neuron in a neural network performs simple arithmetic – but we don’t understand why those mathematical operations result in the behaviors we see.

They’re working on trying to understand LLMs, but aren’t there yet. So, if you understand how they do what they do, then please let us know! It’d be really helpful to make sure we can better align them.

they are manipulating mathematical vectors that in no way are connected representationally to words

Is this not what word/sentence vectors are? Mathematical vectors that represent concepts that can then be linked to words/sentences?

Anyway, I think time will tell here. Let’s see where we are in a couple years. :)

Veraticus@lib.lgbt · edit-2 1 year ago

Large language models by themselves are “black boxes”, and it is not clear how they can perform linguistic tasks. There are several methods for understanding how LLM work.

You are misunderstanding both this and the quote from Anthropic. They are saying the internal vector space that LLMs use is too complicated and too unrelated to the output to be understandable to humans. That doesn’t mean they’re having thoughts in there: we know exactly what they’re doing inside that vector space – performing very difficult math that seems totally meaningless to us.

Is this not what word/sentence vectors are? Mathematical vectors that represent concepts that can then be linked to words/sentences?

The vectors do not represent concepts. The vectors are math. When the vectors are sent through language decomposition they become words, but they were never concepts at any point.

SirGolan@lemmy.sdf.org · 1 year ago

They are saying the internal vector space that LLMs use is too complicated and too unrelated to the output to be understandable to humans.

Yes, that’s exactly what I’m saying.

That doesn’t mean they’re having thoughts in there

I mean. Not in the way we do, and not with any agency, but I hadn’t argued either way on thoughts because I don’t know the answer to that.

we know exactly what they’re doing inside that vector space – performing very difficult math that seems totally meaningless to us.

Huh? We know what they are doing but we don’t? Yes, we know the math, people wrote it. I coded my first neural network 35 years ago. I understand the math. We don’t understand how the math is able to do what LLMs do. If that’s what you’re saying then we agree on this.

The vectors do not represent concepts. The vectors are math. When the vectors are sent through language decomposition they become words, but they were never concepts at any point.

“The neurons are cells. When neurotransmitters are sent through the synapses, they become words, but they were never concepts at any point.”

What do you mean by “they were never concepts”? Concepts of things are abstract. Nothing physical can “be” an abstract concept. If you think about a chair, there isn’t suddenly a physical chair in your head. There’s some sort of abstract representation. That’s what word vectors are. Different from how it works in a human brain, but performing a similar function.

A word vector is an attempt to mathematically represent the meaning of a word.

From this page. Or better still, this article explaining how they are used to represent concepts. Like this is the whole reason vector embeddings were invented.
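For anyone following along, here is a rough sketch of the idea behind word vectors, assuming the simplest possible version: count which words appear near each word, then compare the counts. Real embeddings are learned and dense rather than counted, and the mini corpus and function names below are made up, so treat this as an illustration only.

```python
import math
from collections import Counter

# Made-up mini corpus; real embeddings are learned from vastly more text.
CORPUS = [
    "the cat sat on the chair",
    "the dog sat on the chair",
    "the cat ate the fish",
    "the dog ate the bone",
    "she sat on the stool",
]

def context_vector(word, window=2):
    """Count the words appearing within `window` positions of `word`."""
    counts = Counter()
    for line in CORPUS:
        tokens = line.split()
        for i, tok in enumerate(tokens):
            if tok != word:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(t for j, t in enumerate(tokens[lo:hi], lo) if j != i)
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

cat, dog, chair, stool = (context_vector(w) for w in ("cat", "dog", "chair", "stool"))
cat_dog = cosine(cat, dog)
cat_chair = cosine(cat, chair)
chair_stool = cosine(chair, stool)
```

In this toy corpus "cat" lands closer to "dog" than to "chair", and "chair" lands right next to "stool", purely from how the words are used. Whether that counts as representing a concept is the thing being argued here, but it is what the vectors are built to capture.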

Veraticus@lib.lgbt · edit-2 1 year ago

We do understand how the math results in LLMs. Reread what I said. The neural network vectors and weights are too complicated to follow for an individual, and do not relate on a 1:1 mapping with the words or sentences the LLM was trained on or will output, so individuals cannot deduce the output of an LLM easily by studying its trained state. But we know exactly what they’re doing conceptually, and individually, and in aggregate. Read your own sources from your previous post, that’s what they’re telling you.

Concepts are indeed abstract but LLMs have no concepts in them, simply vectors. The vectors do not represent concepts in anything close to the same way that your thoughts do. They are not 1:1 with objects, they are not a “thought,” and anyway there is nothing to “think” them. They are literally only word weights, transformed to text at the end of the generation process.

Your concept of a chair is an abstract thought representation of a chair. An LLM has vectors that combine or decompose in some way to turn into the word “chair,” but are not a concept of a chair or an abstract representation of a chair. It is simply vectors and weights, unrelated to anything that actually exists.

That is obviously totally different in kind to human thought and abstract concepts. It is just not that, and not even remotely similar.

You say you are familiar with neural networks and AI but these are really basic underpinnings of those concepts that you are misunderstanding. Maybe you need to do more research here before asserting your experience?

Edit: And in relation to your links – the vectors do not represent single words, but tokens, which indeed might be a whole word, but could just as well be part of a word or an entire phrase. Tokens do not represent the meaning of a word/partial word/phrase, just the statistical use of that word given the data the word was found in. Equating these vectors with human thoughts oversimplifies the complexities inherent in human cognition and misunderstands the limitations of LLMs.
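As a rough illustration of the word-versus-token point, here is a sketch of a greedy longest-match tokenizer. The vocabulary is hypothetical and tiny; real tokenizers learn their vocabulary from data (byte-pair encoding is one common approach), so this only shows that a token can be a whole word or just a piece of one.

```python
# Hypothetical vocabulary, invented for illustration only.
TOY_VOCAB = {"chair", "un", "believ", "able", "s", " "}

def tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match split; unknown characters become their own token."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate starting at position i first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry matched: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens

pieces = tokenize("unbelievable chairs")
```

Here "chairs" splits into "chair" plus "s", and "unbelievable" into three pieces; each piece gets its own vector in the model.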

BitSound@lemmy.world · edit-2 1 year ago

You really, truly don’t understand what you’re talking about.

The vectors do not represent concepts. The vectors are math

If this community values good discussion, it should probably just ban statements that manage to be this wrong. It’s like when creationists say things like “if we came from monkeys why are they still around???”. The person has just demonstrated such a fundamental lack of understanding that it’s better to not engage.

Veraticus@lib.lgbt · edit-2 1 year ago

Oh, you again – it’s incredibly ironic you’re talking about wrong statements when you are basically the poster child for them. Nothing you’ve said has any grounding in reality; it is just a series of bald assertions that are as ignorant as they are incorrect. I thought you would’ve picked up on it when I started ignoring you, but: you know nothing about this and need to do a ton more research to participate in these conversations. Please do that instead of continuing to reply to people who actually know what they’re talking about.

bionicjoey@lemmy.ca · 1 year ago

The author is an imbecile if they haven’t been able to break GPT. It took me less than one day of tooling around with it before I got it to say something which outed it as having no understanding of what we were discussing.

mitchell@lemmy.ca · 1 year ago

Adam Something uploaded a video that starts with the definition of intelligence itself and then explains why something that “acts” intelligent isn’t necessarily intelligent.

Veraticus@lib.lgbt · 1 year ago

I think even “intelligence” here is a stretch. In a very narrow sense, it is intelligent: it creates text, simulates conversations, answers questions. But that is not what intelligence is (and it is all LLMs can do).

notenoughbutter@lemmy.ml · 1 year ago

are you not an algorithm?

perfected over thousands of years?

Veraticus@lib.lgbt · 1 year ago

No? Humans are not algorithms except in the most general sense.

For example, there has not been any discovery of an algorithm that allows one to predict human actions, and scientists debate whether such a thing could even exist.