People routinely make up their own vague and ill-defined meanings of understanding and reasoning to disqualify LLMs. This is necessary because LLMs obviously reason and understand by any evaluation that can be carried out.
Seriously just watch. He's not actually going to be able to coherently define his "reasoning" in a way that can be tested.
> Seriously just watch. He's not actually going to be able to coherently define his "reasoning" in a way that can be tested.
Google gives the following definition of the verb "reason":
> think, understand, and form judgments by a process of logic.
LLMs do not think, they do not understand, and they do not form judgments. They do not come to their own conclusions. They do not have the physical capability. They are statistical models, nothing more.
> LLMs obviously reason and understand by any evaluation that can be carried out.
> LLMs do not think, they do not understand, and they do not form judgments. They do not come to their own conclusions. They do not have the physical capability. They are statistical models, nothing more.
"LLMs don't reason because they don't understand" is not the bastion of genius you think it is. It's a circular argument that relies on whatever bespoke interpretations you have cooked up.
They don't form judgments or conclusions? It sure looks like they do. So what's the difference?
What is GPT-4 doing, then, when it looks like it is reasoning correctly, and what's the difference between that and "real" understanding or reasoning?
If the difference is so huge, I should be able to test for it. I don't understand how you can tell me that what I'm seeing isn't real reasoning but fail to provide a way to empirically determine the difference.
New bar for people claiming LLMs can't reason: invent a specific, testable problem, representable in text, that many humans can solve and LLMs can't, and tell us what it is.
That's not stuff an LLM can't do. It's just presented in a way that makes it difficult to do.
First, the vision problems will require the equivalent of an artificial visual cortex, something we are seriously lacking in artificial intelligence at the moment. Image-to-text won't cut it here.
The vision problems will require something much more than an image-to-text objective. They will require the equivalent of an artificial visual cortex. We don't have that yet.
… and then perform a careful search of books and the whole internet to be sure what you think is novel hasn’t been thoroughly debated somewhere on stackexchange.
If it's a stochastic parrot, then merely randomizing proper nouns and filler text should be enough to defeat its apparent abstraction ability.
If you're saying that we can't use a problem if any analog of that problem has ever been described, you seem to be arguing more strongly that it is a general intelligence than I am.
I’m saying that people in my circle have been asking what they think are novel questions and getting interesting answers, only to find out that very similar content exists on websites we know are in the training set.
That’s not intelligence; that’s computers having better memory than humans. Useful, certainly, but hardly skynet.
I don't think you're being clear about whether the questions were novel. If you discover your question was uncreative, surely e.g. some details, wording, facts, names, or numbers inside the question can be changed to defeat a model that is answering it from memory?
If you're saying that it is not possible to change the details enough to avoid the model being able to answer that type of question, I think you are admitting that the model has learned a generalized ability to answer questions of that class, and is not actually using its memory to answer at all.
I don't care about whether it learned that generalized ability from seeing examples of the question and answer, which it then deduced an algorithm for and generalized -- that's how most people learn most things.
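To make the "change the details" test concrete, here is a rough sketch of what I mean. Everything in it is made up for illustration: the name pool, the marble template, and `answer_fn`, which stands in for whatever call you use to get an integer answer out of a model. The point is only that the expected answer is computed from the randomized details, so a lookup of a memorized question should not help.

```python
import random

# Hypothetical name pool and question template, chosen only for illustration.
NAMES = ["Priya", "Tomas", "Keiko", "Olu", "Ingrid", "Mateo"]
TEMPLATE = (
    "{a} has {x} marbles and gives {y} of them to {b}. "
    "{b} then gives {z} marbles back. How many marbles does {a} have now?"
)


def make_variant(rng):
    """Return (question, expected_answer) with freshly drawn names and numbers."""
    a, b = rng.sample(NAMES, 2)   # two distinct names
    x = rng.randint(20, 99)       # starting marbles
    y = rng.randint(1, x)         # given away, never more than x
    z = rng.randint(0, y)         # given back, never more than y
    question = TEMPLATE.format(a=a, b=b, x=x, y=y, z=z)
    return question, x - y + z


def accuracy(answer_fn, n=100, seed=0):
    """Fraction of n randomized variants that answer_fn gets exactly right.

    answer_fn is an assumed placeholder: it takes the question text and
    returns an integer (for example, by querying a model and parsing the reply).
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(n):
        question, expected = make_variant(rng)
        if answer_fn(question) == expected:
            correct += 1
    return correct / n
```

If accuracy stays high across many seeds, "it memorized the answer" stops being a plausible explanation for that class of question. A single template is obviously a weak test; you would want to vary the wording and structure as well.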
The asker thought they were. They were not. The internet is big and human memories are not.
As an aside, I’m really starting to hate these threads on here. People are constantly reading words that aren’t there in search of gotcha-it’s-skynet. It’s not. It’s just pattern matching and randomness with a giant amount of information encoded.
> As an aside, I’m really starting to hate these threads on here.
I'm not sure what to say, other than that if you'd like to have less frustrating conversations, you could do better than showing up with hearsay where someone asked a question they thought was unique, but it wasn't, and it can't be modified to be unique and then asked again, and you aren't willing to tell us what it was, and possibly don't know yourself.
It is not possible to have a serious conversation about your claim, and that's not because it is being intentionally misunderstood.
> skynet
You're the only person mentioning skynet. The conversation is about a ridiculous claim made up-thread that GPT-4 cannot reason or understand anything, which is disprovable within a few minutes of using it thoughtfully.
From the beginning of time, people have been overestimating the complexity of things like the human brain and attributing it to magical things (like a creator) far beyond our comprehension, but what seems to be happening now is that some people are underestimating it.
This is the problem with non-operational definitions, because now we need to know how you define "think" and "understand" and "form judgments", to move on.
Instead, could you operationally define "reason" in a way that a human is, say, 90% likely to pass the test and GPT is 10% likely to pass?
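As a minimal sketch of what that criterion looks like once you have results in hand (the function names and the default thresholds are my own, taken from the 90/10 figures above, and each result list is assumed to hold one pass/fail boolean per attempt):

```python
def pass_rate(results):
    """Fraction of True entries in a non-empty list of pass/fail booleans."""
    return sum(results) / len(results)


def separates(human_results, model_results, human_min=0.9, model_max=0.1):
    """True if the test meets the proposed bar: humans pass at a rate of at
    least human_min while the model passes at a rate of at most model_max."""
    return (
        pass_rate(human_results) >= human_min
        and pass_rate(model_results) <= model_max
    )
```

The hard part is the test itself, not this arithmetic, but writing it down makes it clear what would count as an answer.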
Yes, François Chollet released the ARC (Abstraction and Reasoning Corpus) benchmark for this in 2019, and the benchmark can be scored automatically. Humans solve 100% of the tests, GPTs solve 0%, and GPTs made exactly zero progress from 2019 to 2022.
Another issue is the vision side. The vast majority of multimodal models are working on essentially an image-to-text objective. That won't cut it here. We need the equivalent of an artificial visual cortex. We don't have that yet.
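For anyone wondering what "scored automatically" means here, a hedged sketch: as far as I recall, the public ARC tasks are JSON objects with `train` and `test` lists of `input`/`output` grids (lists of lists of small integers), and a prediction counts only if the whole output grid matches exactly. The `solver` callable below is hypothetical, and this version allows a single attempt per test input, which may be stricter than the benchmark's own rules.

```python
def score_task(task, solver):
    """Fraction of a task's test pairs solved by exact grid match.

    task:   dict with "train" and "test" lists of {"input": grid, "output": grid}
            (assumed layout, modeled on the public ARC JSON files).
    solver: hypothetical callable (train_pairs, test_input) -> predicted grid.
    """
    tests = task["test"]
    hits = sum(
        1 for pair in tests
        if solver(task["train"], pair["input"]) == pair["output"]
    )
    return hits / len(tests)


def score_benchmark(tasks, solver):
    """Mean per-task score over a non-empty list of tasks."""
    return sum(score_task(t, solver) for t in tasks) / len(tasks)
```

Usage is just `score_benchmark(list_of_task_dicts, my_solver)`. There is no partial credit: a grid with one wrong cell scores the same as no answer at all.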
It is. Although I'd be careful with the conclusions.
The problems are presented in a way that makes them difficult to solve.
The vision problems will require the equivalent of an artificial visual cortex, something we are seriously lacking in artificial intelligence at the moment. Image to text won't cut it here.
For the text, there could be tokenizer issues. LLMs don't really have any problem with abstract analogical reasoning:
https://arxiv.org/abs/2212.09196