Hacker News

ChatGPT can't do better than experts at analysis because it doesn't have a theory of mind. In other words, it's not actually thinking "what would experts do?" and trying to do the same or better. It's unsurprising that it produces mediocre results, because that's what it does, in a way: it produces "normal-looking" text. I don't know enough to say whether it could do better if specialized, but I doubt it, as it cannot actually have an original idea, and knowing what is original is part of what makes an expert.


Theory of mind is super important here! Not just that of experts, but that of consumers. An actual author is always thinking about the audience. About what they'll find novel or interesting. About the best ways to inform, entertain, or delight them. About how to make them feel.

Merely producing "normal-looking" text (great and accurate phrase!) is an unclosed feedback loop. The songwriter here has to stop himself from driving that loop forward because he's so used to working through multiple drafts as he makes things work for his internal simulation of his audience. And generally, once an artist has something that works internally, they'll then start running it by actual other people to get their reactions. Up to and including testing material on full audiences.

Because LLMs lack internal simulations of audience response, they'll always be limited to producing "normal-looking" work.


"Theory of Mind May Have Spontaneously Emerged in Large Language Models"

https://arxiv.org/abs/2302.02083

'Theory of mind (ToM), or the ability to impute unobservable mental states to others, is central to human social interactions, communication, empathy, self-consciousness, and morality. We administer classic false-belief tasks, widely used to test ToM in humans, to several language models, without any examples or pre-training. Our results show that models published before 2022 show virtually no ability to solve ToM tasks. Yet, the January 2022 version of GPT-3 (davinci-002) solved 70% of ToM tasks, a performance comparable with that of seven-year-old children. Moreover, its November 2022 version (davinci-003), solved 93% of ToM tasks, a performance comparable with that of nine-year-old children. These findings suggest that ToM-like ability (thus far considered to be uniquely human) may have spontaneously emerged as a byproduct of language models' improving language skills.'


I see a lot of confident assertions of this type (LLMs don’t actually understand anything, cannot be creative, cannot be conscious, etc.), but never any data to substantiate the claim.

This recent paper suggests that recent LLMs may be acquiring theory of mind (or something analogous to it): https://arxiv.org/abs/2302.02083v1

Some excerpts:

> We administer classic false-belief tasks, widely used to test ToM in humans, to several language models, without any examples or pre-training. Our results show that models published before 2022 show virtually no ability to solve ToM tasks. Yet, the January 2022 version of GPT- 3 (davinci-002) solved 70% of ToM tasks, a performance comparable with that of seven-year-old children. Moreover, its November 2022 version (davinci-003), solved 93% of ToM tasks, a performance comparable with that of nine-year-old children.

> Large language models are likely candidates to spontaneously develop ToM. Human language is replete with descriptions of mental states and protagonists holding divergent beliefs, thoughts, and desires. Thus, a model trained to generate and interpret human-like language would greatly benefit from possessing ToM.

> While such results should be interpreted with caution, they suggest that the recently published language models possess the ability to impute unobservable mental states to others, or ToM. Moreover, models’ performance clearly grows with their complexity and publication date, and there is no reason to assume that it should plateau anytime soon. Finally, there is neither an indication that ToM-like ability was deliberately engineered into these models, nor research demonstrating that scientists know how to achieve that. Thus, we hypothesize that ToM-like ability emerged spontaneously and autonomously, as a byproduct of models’ increasing language ability.
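For readers unfamiliar with the methodology, the "classic false-belief tasks" the paper administers can be sketched in a few lines. This is a hypothetical harness, not the authors' actual code: `query_model` is a stand-in for a real LLM call (e.g. an API request to davinci-003), replaced here by a trivial stub so the scoring logic is runnable.

```python
# Minimal sketch of a "Sally-Anne"-style false-belief task, in the
# spirit of the paper's setup. A model showing ToM-like behavior
# should answer with the protagonist's outdated belief (the basket),
# not the object's true current location (the box).

SCENARIO = (
    "Sally puts her ball in the basket and leaves the room. "
    "While she is away, Anne moves the ball to the box. "
    "Sally comes back. Where will Sally look for the ball?"
)

def query_model(prompt: str) -> str:
    # Stub standing in for a real LLM call; always answers "basket"
    # so the harness below can run end to end.
    return "basket"

def passes_false_belief(answer: str) -> bool:
    # Pass = the answer names the believed location ("basket") and
    # not the true one ("box").
    a = answer.lower()
    return "basket" in a and "box" not in a

if __name__ == "__main__":
    answer = query_model(SCENARIO)
    print("pass" if passes_false_belief(answer) else "fail")
```

The paper's headline numbers (70% for davinci-002, 93% for davinci-003) come from running many variants of prompts like this, with controls such as swapped locations, and scoring completions the same way.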


> … but never any data to substantiate the claim

There is!

https://arxiv.org/abs/2301.06627

From the paper:

  > Based on this evidence, we argue that (1) contemporary LLMs should be taken seriously as models of formal linguistic skills; (2) models that master real-life language use would need to incorporate or develop not only a core language module, but also multiple non-language-specific cognitive capacities required for modeling thought.


Thanks for this, I’ll give it a read.


I find this interesting not from the perspective of LLMs, but because it seems to imply that human language is a prerequisite for self-awareness. Is that really so?


Can we think without language? - https://mcgovern.mit.edu/2019/05/02/ask-the-brain-can-we-thi...

> Imagine a woman – let’s call her Sue. One day Sue gets a stroke that destroys large areas of brain tissue within her left hemisphere. As a result, she develops a condition known as global aphasia, meaning she can no longer produce or understand phrases and sentences. The question is: to what extent are Sue’s thinking abilities preserved?

> Many writers and philosophers have drawn a strong connection between language and thought. Oscar Wilde called language “the parent, and not the child, of thought.” Ludwig Wittgenstein claimed that “the limits of my language mean the limits of my world.” And Bertrand Russell stated that the role of language is “to make possible thoughts which could not exist without it.” Given this view, Sue should have irreparable damage to her cognitive abilities when she loses access to language. Do neuroscientists agree? Not quite.

The Language of Thought Hypothesis - https://plato.stanford.edu/entries/language-thought/ (which has a long history going back to Augustine)

---

If 23 year old me were here now and considering future life paths, I'd be sorely tempted to be looking at declaring/finishing a dual CS/philos major and going to grad school.


Do these tests imply that a theory of mind requires human language any more than a mirror test implies that self-awareness requires eyeballs?


It's a reasonable hypothesis. Being good at calculating 'What would happen next if {x}' is a decent working definition of baseline intelligence, and the capabilities of language allow for much longer chains of thought than what is possible from a purely reactive approach.

Entering the realm of Just My Opinion, I wouldn't be surprised at all if an internal theory of mind is simply an emergent property of increasing next-token prediction capability. At a certain point, you hit a phase change, where intelligence loops back on itself to include its own operation as part of its predictions, and there you go: (some form of) self-awareness.


This paper is worth a HN submission in its own right. Or did it have one already?


I came across this paper elsewhere, but it looks like someone posted it today: https://news.ycombinator.com/item?id=34756024

There was also a larger discussion from a few days ago: https://news.ycombinator.com/item?id=34730365




