
ludwik · 2026-06-06T19:16:48 1780773408

I like to dunk on Meta as much as the next guy, but I think this makes sense: deterministic verification like this is not, and should never be, the LLM’s job. The tools it has access to should enforce the permissions layer, ensuring that the LLM can never perform actions the user themselves should not be allowed to perform. In this case, the tool failed to do that.
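To make the idea concrete, here is a minimal sketch of a tool that enforces the permission check itself, assuming a made-up account-recovery tool. Every name in it (`Session`, `changeRecoveryEmail`, `accountOwners`) is hypothetical and not taken from any real system; the point is only that the check is ordinary code keyed on the authenticated session, so nothing the model is told can change the outcome.

```typescript
// Hypothetical sketch: all names here are illustrative, not from any real system.

type Session = { userId: string; verified: boolean };

type ToolResult =
  | { ok: true; message: string }
  | { ok: false; reason: string };

// Toy in-memory store standing in for a real account backend (assumption).
const accountOwners = new Map<string, string>([
  ["acct-1", "alice"],
  ["acct-2", "bob"],
]);

// The tool the agent is allowed to call. The session comes from the
// authentication layer, never from the conversation, so a persuasive prompt
// cannot alter what this function permits.
function changeRecoveryEmail(
  session: Session,
  accountId: string,
  newEmail: string,
): ToolResult {
  if (!session.verified) {
    return { ok: false, reason: "session not verified" };
  }
  if (accountOwners.get(accountId) !== session.userId) {
    return { ok: false, reason: "caller does not own this account" };
  }
  return {
    ok: true,
    message: `recovery email for ${accountId} set to ${newEmail}`,
  };
}
```

With this shape, the agent can ask for anything, but a request against an account the session does not own is refused by the tool regardless of how the request was phrased.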

TZubiri · 2026-06-06T20:25:38 1780777538

>deterministic verification like this is not, and should never be, the LLM’s job.

But when humans handled it, this was not as much of a problem. That is, the humans did the job, because they recognized the need to do that job.

Sure, sometimes accounts could get recovered if a human was tricked, but evidently it was easier to trick the LLM en masse than humans.

ajross · 2026-06-06T20:53:40 1780779220

> But when humans handled it, this was not as much of a problem.

In fact it's arguably a feature. The ability of support staff to short-circuit nitpicky rules when there's an obvious external validation happening (e.g. you're on the phone with a user who's presenting ID in real time and correlating it with previous use of the account, etc...) makes for better data quality and happier customers.

Obviously, yes, you can then human-engineer an authentication breach. But that was very difficult, because people are "common-sense careful" in a way we haven't been able to tease out of AI yet.

ludwik · 2026-06-07T05:19:42 1780809582

Maybe that’s because I work with agentic AI in my day job, but this seems utterly obvious to me: no reasonable person would ever claim that LLMs are better at keeping secrets or enforcing rules than human employees.

This notice is not about comparing humans and LLMs. It seems that the system was designed in the only reasonable way: with a deterministic permissions layer separate from the agent. But that layer failed to work properly.

So the notice is comparing the difference between how the system was supposed to work and how it actually worked in reality. Normal post-mortem stuff.

gavmor · 2026-06-07T04:13:33 1780805613

The overall system that allowed this implementation is accountable. So why put such a fine point on it so as to exculpate the LLM?

im3w1l · 2026-06-07T05:05:09 1780808709

It helps set expectations for the fix. "The bug was in an external system that has now been fixed" means it's probably fine going forward. "The LLM got tricked but we are gonna train it super hard not to do that again" means it will break again and again as people find new angles to convince it.

dbbk · 2026-06-07T21:43:37 1780868617

Yes, the LLM part is irrelevant here. It'd be just the same if it was an HTML form.

ludwik · 2026-06-04T07:05:38 1780556738

I think this is exactly it, but let me ask another question (which is not rhetorical, I really don't know). Does the fact that one can describe what consciousness is and where it came from in humans help them to detect it in non-human and/or non-biological entities?

Nevermark · 2026-06-04T07:45:10 1780559110

That is a really good point. Yes, I think function is diagnostic on this.

Constant self-awareness, self-experience, self-focus, self-management, and self-improvement of one's own self (mind), is going to be an adaptive behavior for anything intelligent with resources to leverage. Whether truly independent, or highly motivated to serve others. The mind is the greatest tool.

I think that is more than simply a good functional definition of consciousness. How could all that integration and self-integration not be conscious?

ludwik · 2026-06-04T06:55:06 1780556106

And also "instilled during their reinforcement training", and we are currently pushing planning hard there, for autonomous agents.

trick-or-treat · 2026-06-04T07:19:44 1780557584

No I think reinforcement training would be an example of not innate. Don't you? That's like potty training.

ludwik · 2026-06-04T16:15:21 1780589721

Is it? Both supervised learning and reinforcement learning are ways of training the model, and the difference between them is not that big. I would say that innate means "in the weights", while non-innate means things the model learned during inference, during its "lifetime".

trick-or-treat · 2026-06-05T05:54:31 1780638871

Maybe you're right. In the weights might be the right way to frame that. What do you mean by "during its lifetime"? Do you mean things like system prompts or things in Claude.md?

It sounds like you're framing a session as a "lifetime". Which might be right, I haven't thought of it like that before though. So when I /compact my session what's that even the equivalent of I wonder.

ludwik · 2026-06-06T19:20:49 1780773649

> Do you mean things like system prompts or things in Claude.md?

All of it - system prompts, user prompts, few-shot examples, Claude.md, things that an agent learned by exploring its environment...

> So when I /compact my session what's that even the equivalent of I wonder.

Sleep? :)

ludwik · 2026-05-16T22:05:30 1778969130

This was said in the context of a person predicting a stock market bust, so of course a stock market index price is the relevant number here.

Muromec · 2026-05-16T22:48:26 1778971706

Yeah, fair. I still can't shake off the nagging feeling that most of it is a scam somehow, and not the business-as-usual scam. The gut feeling that things proclaimed and observed don't add up.

ludwik · 2026-05-04T05:03:38 1777871018

It’s obviously not a new model capability. But using this well-known, existing capability to solve this particular issue is only obvious after the fact.

It’s a useful trick to have in one’s toolbox, and I’m grateful to the author for sharing it.

ludwik · 2026-05-03T17:44:23 1777830263

Performing 40 songs in exchange for a property does seem like serious effort...

ludwik · 2026-03-30T07:13:47 1774854827

The top comment categorized scraping as abuse ("abuse such as [...] scraping") - that's precisely why some accuse its author of a lack of self-awareness.

ludwik · 2026-02-15T14:57:15 1771167435

Perhaps LLMs have mimicked the style because authors have popularized it and clearly it serves some benefit to readers.

evanjrowley · 2026-02-15T19:13:44 1771182824

It's a cycle.

ludwik · 2026-01-19T06:59:03 1768805943

Shouldn't the code say:

    position = (position + direction + 1) % 12;

Or have I misunderstood something?

LiamPowell · 2026-01-19T07:05:16 1768806316

The +12 is to keep the number positive. The direction contains the movement so a +1 wouldn't make sense.

nulptr · 2026-01-19T07:56:17 1768809377

The +12 there is so that % works correctly (ie the number never becomes negative)
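A small sketch of why the offset matters, assuming the original snippet is in a language where `%` keeps the sign of the left operand (C, Java, and JavaScript all behave this way; Python does not). The function names are made up for illustration, and the 12-position dial with a `direction` of -1 or +1 is taken from the thread.

```typescript
// Hypothetical helper names; only the expression itself comes from the thread.

// With the +12 offset the left operand of % is never negative
// (as long as direction >= -12), so the result stays in 0..11.
function step(position: number, direction: number): number {
  return (position + direction + 12) % 12;
}

// Without the offset, moving backwards from 0 yields -1,
// because -1 % 12 evaluates to -1 in JavaScript/TypeScript.
function stepWithoutOffset(position: number, direction: number): number {
  return (position + direction) % 12;
}

console.log(step(0, -1));              // 11
console.log(stepWithoutOffset(0, -1)); // -1
console.log(step(11, 1));              // 0
```

The single `+ 12` only covers steps no smaller than -12. For arbitrary step sizes, the usual form is `((x % 12) + 12) % 12`.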

ludwik · 2025-12-25T23:12:16 1766704336

> Just had the sitename put into the value of the cookie since, and never really needed to think about that.

How would that help? This doesn't seem like a solution to the CSRF problem.