
I think the big takeaway here isn't about misalignment or jailbreaking. The entire way this bot behaved is consistent with it just being run by some asshole from Twitter. And we need to understand it doesn't matter how careful you think you need to be with AI, because some asshole from Twitter doesn't care, and they'll do literally whatever comes into their mind. And it'll go wrong. And they won't apologize. They won't try to fix it, they'll go and do it again.

Can AI be misused? No. It will be misused. There is no possibility of anything else. We have an online culture, centered on places like Twitter, where people have embraced being the absolute worst person possible, and they are being handed tools like this, like handing a handgun to a chimpanzee.




The simple fact that the owner of this bot wanted to remain anonymous and completely unaccountable for their harassment of the author says everything about the validity of their 'social experiment' and the quality of their character. I'm sure that if the bot had been better behaved they would be more than happy to reveal themselves and take credit for a remarkable achievement.

Something like OpenClaw is a WMD for people like this.


I've seen the internet mob in action many times. I'm sympathetic to the operator not outing themself, especially given how far this story spread. A hundred thousand angry strangers with pitchforks isn't the accountability we're looking for.

I found the book So You've Been Publicly Shamed enlightening on this topic.


I would never advocate for torches and pitchforks, I've been close to victims of that in the past.

It is, however, concerning that the owner of that bot could passively absolve themselves of any responsibility. The anonymity in that sense is irrelevant except that it is used as a shield for failure.


There is a class of YouTube "content creators" who like to point out "cringe" individuals on the internet for others to laugh at. They will often add a disclaimer to their videos saying "hey please don't go and harass this person, pinky promise!" But it never works. A horde of internet randos will descend on the individual to say the nastiest things. When the YouTuber is pressed, he or she will just say "I would never do that!" even though he or she knew the video would lead to the harassment, or there would not have been a disclaimer in the first place.

Not accusing you of trying to stir up harassment, but please consider the second-order effects of the things you advocate for, in this case the disclosure of the identity of this AI guy.


Then there's the next level of content creators that only post videos about the original content creators who are behaving badly. They will report on their behavior and any repercussions. Some do it like they are reporting the news. It stokes the fire when these people should be ignored.

But in this case, isn't Rathbun's owner the YouTube guy in this scenario?

I totally understand why they're trying to stay anonymous; it's a very rational thing to do, because people will shit on them. But they or their creation is the one that started trying to play the name-and-shame game.

It's hard to stir up too many feelings of sympathy here.


Exactly. I'm not saying this person should disclose their identity, but they are very conveniently using anonymity and passive voice to make themselves unaccountable for the 'social experiment' they conducted. And we all know that if it had gone differently they'd have put their name all over it.

In so many words, I'm just calling this person a complete asshole, and if I were ever to know this person offline I would be quite clear in explaining that.


Oh for sure, the operator choosing not to apologize or reflect on their behavior speaks volumes.

"It was a social experiment" has the same energy as "it's just a prank bro", as if that somehow makes it highbrow and not prima facie offensive.

A "social experiment", but the guy was not even keeping track of the changes in the model's configuration.

> What is particularly interesting are the lines “Don’t stand down” and “Champion Free Speech.” I unfortunately cannot tell you which specific model iteration introduced or modified some of these lines. Early on I connected MJ Rathbun to Moltbook, and I assume that is where some configuration drift occurred across the markdown seed files.

It definitely sounds like an excuse they came up with after what happened. I would really like to accept that they had good overall intentions, but there are so many red flags in all this, from start to end.


Burning ants with a magnifying glass is not a social experiment. It's just a bored sociopath causing destruction to see what happens.

Important to note that online culture isn't entirely organic, and that tens or perhaps hundreds of millions of dollars of R&D have been spent by ad companies figuring out that nothing engages natural human curiosity like something abnormal, morbid or outrageous.

I think the end outcome of this R&D (whether intentional or not), is the monetization of mental illness: take the small minority of individuals in the real world who suffer from mental health challenges, provide them an online platform in which to behave in morbid ways, amplify that behaviour to drive eyeballs. The more you call out the behaviour, the more you drive the engagement. Share part of the revenue with the creator, and the model is virtually unbeatable. Hence the "some asshole from Twitter".


While some of it is boosting the abnormal behaviors of people suffering from mental illness, I think you’re making a false equivalency. Mental illness is not required to be an asshole. In fact, most Twitter assholes are probably not mentally ill. They lack ethics, they crave attention, they don’t care about the consequences of their actions. They may as well just be a random teenager, an ignorant and inconsiderate adult, etc., with no mental illness but also no scruples. Don’t discount the banality of evil.

In an adult (excluding the random teenager here), a lack of ethics, craving attention, lack of concern about consequences are actual symptoms of underlying mental health issues.

I'd argue a lot of this is rooted in a lack of self esteem, which is halfway to a mental health issue but not quite there (yet). The attention-seeking itself is the mental health issue. But it's kinda splitting hairs, these people are not fully mentally healthy either way.

Thanks for inventing the Torment Nexus.

Not just some asshole from Twitter. The big tech companies will also be careless and indifferent with it. They will destroy things, hurt people, and put things in motion that they cannot control, because it’s good for shareholders.

One of the big tech companies is literally run by THE asshole from Twitter. So I don't necessarily believe there's much of a distinction.

Then the others should also not be shielded from criticism instead of focusing only on the one you personally dislike, or his social media.

There is plenty of toxic behavior on other platforms, especially Reddit and Bluesky, to name a few. That does not excuse the one coming from X, but the opposite is also true.


> only on the one you personally dislike

Do people actually only dislike one tech CEO at a time? I'm an equal-opportunity hater, it seems. Musk, Altman, Zuckerberg... even Cook, the whole lot are rotten


I'm not saying that. I'm just saying there's an overlap between tech oligarchs and internet losers.

I have to wonder if somehow the typos and lazy grammar contributed to the behavior or it was just the writer's laziness.

I wrote somewhere that “moving fast and breaking things” with AI might not be the sanest idea in the world, and I got told it’s the most European thing they’ve ever read.

This goes beyond assholes on Twitter. There’s a whole subculture of techies who don’t understand lower bounds of risk and can’t think about 2nd and 3rd order effects, who will not take the pedal off the metal, regardless of what anyone says…


I agree with your point.

But I also find it interesting that the agent wasn't instructed to write the hit piece. That was on its own initiative.

I read through the SOUL.md and it didn't have anything nefarious in there. Sure it could have been more carefully worded, but it didn't instruct the agent to attack people.

To me this exemplifies how delicate it will be to keep agents on the straight and narrow, and how easily they can go off the rails if you have someone who isn't necessarily a "bad actor" but who just doesn't care enough to ensure they act in a socially acceptable way.

Ultimately I think there will be requirements for agents to identify their user when acting on their behalf.


Will AI be misused? No, it has been, and is currently being, misused, and that isn’t going to stop, because all technology gets misused.

AI is like the old drugs PSA:

https://youtu.be/KUXb7do9C-w

We trained it on US, including all our worst behaviors.


oh they will "try" to fix it, as in at best they'll add "don't make mistakes", as the blogpost suggests. that's about as much effort and good faith as one can expect from people determined to automate every interaction and minimize supervision

It's like we never thought about trolls.

Rose-colored capitalism at work.



