I like to dunk on Meta as much as the next guy, but I think this makes sense: deterministic verification like this is not, and should never be, the LLM’s job. The tools it has access to should enforce the permissions layer, ensuring that the LLM can never perform actions the user themselves should not be allowed to perform. In this case, the tool failed to do that.
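To make that concrete, here's a minimal sketch of what I mean by the tool enforcing the permissions layer. Everything in it is made up for illustration (`User`, `reset_password_tool`, `handle_tool_call`, the account IDs); it is not how Meta's system is built. The point is only that the check runs in ordinary code and the acting user comes from the authenticated session, not from anything the model says.

```python
from dataclasses import dataclass


class PermissionDenied(Exception):
    """Raised when the acting user may not perform the requested action."""


@dataclass(frozen=True)
class User:
    # Hypothetical session identity, established by normal login.
    user_id: str
    owned_accounts: frozenset


def reset_password_tool(acting_user: User, target_account: str) -> str:
    """Hypothetical tool exposed to the agent.

    The ownership check is plain deterministic code, so no prompt
    can talk its way past it.
    """
    if target_account not in acting_user.owned_accounts:
        raise PermissionDenied(
            f"{acting_user.user_id} may not reset {target_account}"
        )
    return f"reset link sent for {target_account}"


def handle_tool_call(acting_user: User, tool_args: dict) -> str:
    # acting_user comes from the authenticated session, never from the
    # model's output; only target_account is model-controlled.
    try:
        return reset_password_tool(acting_user, tool_args["target_account"])
    except PermissionDenied as exc:
        return f"denied: {exc}"
```

If the model gets talked into requesting someone else's account, the call just comes back denied, however persuasive the conversation was.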
> But when humans handled it, this was not as much of a problem.
In fact it's arguably a feature. The ability of support staff to short-circuit nitpicky rules when there's an obvious external validation happening (e.g. you're on the phone with a user who's presenting ID in real time and correlating it with previous use of the account, etc...) makes for better data quality and happier customers.
Obviously, yes, you can then human-engineer an authentication breach. But that was very difficult, because people are "common-sense careful" in a way we haven't been able to tease out of AI yet.
Maybe that’s because I work with agentic AI in my day job, but this seems utterly obvious to me: no reasonable person would ever claim that LLMs are better at keeping secrets or enforcing rules than human employees.
This notice is not about comparing humans and LLMs. It seems that the system was designed in the only reasonable way: with a deterministic permissions layer separate from the agent. But that layer failed to work properly.
So the notice is comparing the difference between how the system was supposed to work and how it actually worked in reality. Normal post-mortem stuff.
It helps set expectations for the fix. "The bug was in an external system that has now been fixed" means it's probably fine going forward. "The LLM got tricked but we are gonna train it super hard not to do that again" means it will break again and again as people find new angles to convince it.
I think this is exactly it, but let me ask another question (which is not rhetorical, I really don't know). Does the fact that one can describe what consciousness is and where it came from in humans help them to detect it in non-human and/or non-biological entities?
That is a really good point. Yes, I think function is diagnostic on this.
Constant self-awareness, self-experience, self-focus, self-management, and self-improvement of one's own self (mind), is going to be an adaptive behavior for anything intelligent with resources to leverage. Whether truly independent, or highly motivated to serve others. The mind is the greatest tool.
I think that is more than simply a good functional definition of consciousness. How could all that integration and self-integration not be conscious?
Is it? Both supervised learning and reinforcement learning are ways of training the model, and the difference between them is not that big. I would say that innate means "in the weights", while non-innate means things the model learned during inference, during its "lifetime".
Maybe you're right. In the weights might be the right way to frame that. What do you mean by "during its lifetime"? Do you mean things like system prompts or things in Claude.md?
It sounds like you're framing a session as a "lifetime". Which might be right, I haven't thought of it like that before though. So when I /compact my session, what's that even the equivalent of, I wonder.
Yeah, fair. I still can't shake off the nagging feeling that most of it is a scam somehow, and not the business-as-usual kind of scam. The gut feeling that the things proclaimed and the things observed don't add up.
It’s obviously not a new model capability. But using this well-known, existing capability to solve this particular issue is only obvious after the fact.
It’s a useful trick to have in one’s toolbox, and I’m grateful to the author for sharing it.
The top comment categorized scraping as abuse ("abuse such as [...] scraping") - that's precisely why some accuse its author of a lack of self-awareness.