
> Do you think it's really a training set problem? I don't think you learn to say that you don't understand by observing people say it, you learn to say it by being introspective about how much you have actually comprehended, understanding when your thinking is going in multiple conflicting directions and you don't know which is correct, etc.

I really do think it's a training set problem. It's been amply demonstrated that the models often do know when they're saying something false.

Sure, that's not how children learn to do this... or is it? I think in some cases, and to some degree, it is. Children also learn by valuing consistency and by separately acquiring morals. LLMs seem to pick up morals to some degree too, but to the extent they can reason about consistency at all, that reasoning certainly doesn't feed back into their training.

---

So yeah, I think it's a training set issue, and the reason children don't need this kind of data is that they have capabilities the LLMs lack. This would be a workaround for that gap.


