I think a big part is which model works better with your language/stack. My language is Elixir, which is somewhat niche, and only Claude has been able to produce usable Elixir code so far. Because of this, none of the other things mentioned in the article mattered. I wonder whether others have had this experience, where some models just struggle with certain languages/stacks?
Hey Jose, author here! That's a great call-out. I write predominantly in Swift, and for a long time Claude was the only usable option. But sometime around GPT-5, OpenAI's models got much better at Swift, so the choice started becoming more about aesthetics (as a descriptor of preferences). So you're right: if the model can't write coherent code, then it doesn't matter what kind of flow you feel as you work with the tools. But I do imagine this will continue to improve for all languages, including Elixir.
If it's okay to mention my own project, I'd appreciate it if you could check out https://charleswiltgen.github.io/Axiom/ (open source) and let me know what you think. It's focused on modern Swift, with specialized skills for helping developers get to strict Swift 6 concurrency.
I've only skimmed it since I'm between Christmas and a longer vacation that starts in 24 hours, but this actually looks really neat! I'll definitely take a closer look at these skills in depth. This is exactly the kind of thing I've been telling people to take the time to invest in for their agentic environments. :)