Hacker News

So... this would be fine with them?

Claude: "Are you sure you want me to commit murder?"

User: "Yes"

Or do you mean Human presses button:

Claude: "Do you want to commit murder? If so, press the button."

User: "I pressed the button"

Claude: "Great! Now let's summarize what we did."




First one

Seems like an absurd distinction to me... Reminds me of "I was just following orders"...

I mean the distinction doesn't really matter

There are many ways to construct HITL UXes, but typically they'd take the form of the first one.

I think you're missing the forest for the trees. All Anthropic is saying is that HITL is required before murder; the UX is irrelevant.


I agree the distinction doesn't matter, but I'm not so sure "just" having a human in the loop qualifies as an ethical stand. Just because you're not pulling the trigger doesn't make you not culpable for the outcome.


