
The individual person that pressed the "go" button (if there was a person) is going to henceforth be __the best__ DevOps person to ever have on your team. They have learned a multi-trillion-dollar lesson that no amount of training could have prepared them for.

And the CrowdStrike CTO has either been given the ammunition to get __whatever they ask for, ever again__ with regard to appropriate allocation of resources for DevOps *or* they'll be fired (whether or not it's their fault).

And let me be very clear. This is absolutely, positively and wholly not the person that pressed the button's fault. Not even a little. At a company as integral as CrowdStrike, the number of mistakes and errors that had to have happened long before it got to "Joe the Intern Press Button" is huge and absurd. But many of us have been in (a much, much, *MUCH* smaller version of) Joe's shoes, and we know the gut-sinking feeling that hits when something bad happens. A good company and team won't blame Joe and will do everything they can to protect Joe from the hilariously bad systemic issues that allowed this to happen.



Or maybe you get someone with PTSD and suicidal tendencies. You never know how someone will process something like this.


> not the person that pressed the button's fault

You know that, and I know that. The people who will ruin his life starting today do not know (or care).


This is why it is the responsibility (yes, responsibility) of every one of their coworkers, especially those more senior than them, to fight *HARD* to protect them.

This is part of the job of a senior.


"We must hang together, or we will all hang separately" is a lesson that I don't think programmers will ever learn.


It is not a human error, it is a process that has to be improved. Humans make mistakes, that is why we have processes in place.


Basic training could've taught him how not to do YOLO global rollouts. The stress of this mistake will make him remember a lot, but given the lack of basic knowledge that would've prevented this, the lesson will not be very valuable.


Absolutely it's a failure of process. But sometimes people just don't pay attention. Hiring unobservant or reckless people is a risk multiplier.


CrowdStrike will not exist after this is over.


That's very optimistic.


I literally just dumped 30 switches yesterday across an entire facility and had to walk to 30 closets on foot to recover from ROMMON.

Shit happens. We learn.


Can you explain what went wrong?



