> I would really like to see professional, established coach running around with young prodigies on a peak of their biology.
This is a really strange nit. You are aware it's an analogy about skill and role. To reduce this to being about biology and the impacts of senescence on ability is weird, and doesn't really apply here.
Analogies have to make sense to be applicable, and this one doesn't.
E.g. you can't just spew nonsense like "let's work together like a bee hive, everything for the Queen/CEO, no matter the personal cost to an individual" without others pointing out the stupidity of comparing humans with bees.
You can't just come up with a desirable adjective and start inventing random scenarios in which those characteristics may occur. "Let's make the company strong as a gorilla, big as an elephant, smart as Von Neumann, bright as the Sun, as courageous as young guys from YouTube fail compilations." This makes no sense whatsoever.
It makes plenty of sense. Player-coaches are a real thing, and in a realm where you're not worried about peak fitness then it's reasonable to demand the coaches become player-coaches.
Player-coaches are a real thing, but they're notable precisely because of how rare and unusual they are. The problem is that the analogy doesn't even hold up in the source it's referring to.
Sure, there are good player-coaches, but there are also great pure leaders. There are also very bad player-coaches. A coach who tries too hard, and reaches too deep, to be a player when they are less "fit" (or skilled) has historically led to many problems in many cases.
It's not a deep analogy. It's not saying player-coaches are inherently better, just that in their particular situation they want the managers to be coding.
There's not much equivalent to "fit" here, just skill, and they decided they don't want the pure leaders, they want ones that are knuckle deep in the sausage.
Good decision or not, that very basic analogy is completely fine.
This comment doesn't add anything novel to the discussion, but I think it's worth adding because Hubbers and MSFT folks read HN - I too am personally evaluating leaving. Professionally, we're talking about it loosely, and if this continues, leaving will become increasingly likely.
Have you run a system in production? There are a multitude of reasons that a system can go down. There's no indication so far from Anthropic that this was merely a compute limitation.
> There are a multitude of reasons that a system can go down.
Start doing postmortems, then!
At the very least, learning that some off-the-shelf service they use is shitting the bed would inform others to stay away from it - like an IAM solution, or maybe a particular DB in a specific configuration backing whatever they've written, or a given architecture at a given scale.
Right now it's a complete black box that sometimes goes down, and we don't get much information about why it's so much less stable than other options (hey, if they just came out and said "We're growing 10x faster than we anticipated and systems X, Y, and Z are not architected for that," that would also be a useful signal).
Or, who knows, maybe it's just bad deploys - seems like it's back for me and claude.ai UI looks a bit different hmmm.
I have no inside knowledge of Anthropic. But having done a lot of postmortems in general, one of the key dynamics that routinely comes up is "we know we keep shipping breakages, and we know these new procedures would prevent many of them, but then we wouldn't be able to deliver new stuff so quickly". Given where Anthropic is at and what they believe about the future of software development, that's a tradeoff that they may very well be intentionally not making.
Yeah, this is not just inference. The first thing I noticed was an MCP I use going down in Claude Code while the models still worked. Now: "API Error: 529 Authentication service is temporarily unavailable."
Why does the system work like that? Is the cache local, or on Claude's servers?
Why not store the prompt cache to disk when it goes cold for a certain period of time, and then when a long-lived, cold conversation gets re-initiated, you can re-hydrate the cache from disk. Purge the cached prompts from disk after X days of inactivity, and tell users they cannot resume conversations over X days without burning budget.
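Something like this, roughly - a minimal sketch of the tiering, assuming the cache entry can be treated as an opaque blob (the thresholds, paths, and names here are all made up, not Anthropic's internals):

```python
import os
import pickle
import time

COLD_AFTER_S = 15 * 60           # spill to disk after 15 minutes idle (hypothetical)
PURGE_AFTER_S = 7 * 24 * 3600    # the "X days" above: purge from disk after 7 days
SPILL_DIR = "/var/cache/prompt-cache"  # hypothetical spill location

os.makedirs(SPILL_DIR, exist_ok=True)

hot = {}  # conversation_id -> (last_used_ts, kv_blob) held in fast storage

def spill_cold():
    """Periodically move idle entries from the hot tier to disk."""
    now = time.time()
    for conv_id, (last_used, blob) in list(hot.items()):
        if now - last_used > COLD_AFTER_S:
            with open(os.path.join(SPILL_DIR, conv_id), "wb") as f:
                pickle.dump(blob, f)
            del hot[conv_id]

def resume(conv_id):
    """Re-hydrate a conversation's cache from disk, or report a miss."""
    if conv_id in hot:
        _, blob = hot[conv_id]
        hot[conv_id] = (time.time(), blob)
        return blob
    path = os.path.join(SPILL_DIR, conv_id)
    if os.path.exists(path):
        if time.time() - os.path.getmtime(path) > PURGE_AFTER_S:
            os.remove(path)  # past the purge window: resuming burns budget
            return None
        with open(path, "rb") as f:
            blob = pickle.load(f)
        hot[conv_id] = (time.time(), blob)
        return blob
    return None  # cache miss: replay the whole prompt at full prefill cost
```

Disk is cheap next to GPU memory; the catch, as others note below, is the size of the blob you'd be moving around.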
The cache is on Anthropic's servers; it's like a freeze-frame of the LLM's inner workings at that point in time. The LLM can pick up directly from this save state. As you can guess, this save state contains bits of the underlying model - their secret sauce - so it cannot be saved locally...
Maybe they could let users store an encrypted copy of the cache? Since the users wouldn't have Anthropic's keys, it wouldn't leak any information about the model (beyond perhaps its number of parameters judging by the size).
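Conceptually something like this - a minimal sketch assuming a symmetric, provider-held key (Fernet is just a stand-in; key management and replay protection are elided):

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Provider-held key: the user only ever sees ciphertext, so nothing about
# the model leaks beyond the blob's size. (Hypothetical sketch, not a real API.)
fernet = Fernet(Fernet.generate_key())

def export_cache(kv_blob: bytes) -> bytes:
    """Instead of silently expiring the cache, hand the user an opaque ciphertext."""
    return fernet.encrypt(kv_blob)

def import_cache(ciphertext: bytes) -> bytes:
    """On resume, the user sends the ciphertext back and the server decrypts it."""
    return fernet.decrypt(ciphertext)
```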
I'm unsure of the sizes needed for a prompt cache, but I suspect it's several gigs in size (a percentage of the model weight size). How would the user upload this every time they resumed an old idle session? Also, are they going to save /every/ session you do this with?
A few gigs of disk is not that expensive. IMO they should allocate every paying user (at least) one disk cache slot that never expires. Use it for their most recent long chat (a very short question-answer exchange that could easily be replayed shouldn't evict a long convo).
I don't know how large the cache is, but Gemini guessed that the quantized cache size for Gemini 2.5 Pro / Claude 4 with a 1M context could be 78 gigabytes. ChatGPT guessed even bigger numbers. If someone is able to deliver a more precise estimate, you're welcome to :-).
So it would probably be quite a long transfer to perform in these cases - probably not very feasible to implement at scale.
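For what it's worth, the standard back-of-the-envelope formula for KV-cache size is 2 (for K and V) × layers × KV heads × head dim × tokens × bytes per element. Plugging in guessed architecture numbers (none of these are public) lands in the same ballpark:

```python
# All architecture numbers are guesses; neither Gemini's nor Claude's are published.
layers = 80           # transformer layers
kv_heads = 8          # grouped-query-attention KV heads
head_dim = 128        # dimension per head
tokens = 1_000_000    # 1M-token context
bytes_per_elem = 0.5  # 4-bit quantized cache

cache_bytes = 2 * layers * kv_heads * head_dim * tokens * bytes_per_elem
print(f"~{cache_bytes / 1e9:.0f} GB")  # ~82 GB, close to the 78 GB guess above
```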
What's lost in this thread is that these caches are in very tight supply - they are literally on the GPUs running inference. The GPUs must load all the tokens in the conversation (expensive), and then continuing the conversation can leverage the GPU cache to avoid re-loading the full context up to that point. But obviously GPUs are in super tight supply, so if a thread has been dead for a while, they need to re-use the GPU for other customers.
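In other words, the scheduling problem is roughly an LRU over scarce GPU-resident slots. A toy sketch (the slot count and helper are invented):

```python
from collections import OrderedDict

GPU_SLOTS = 4  # hypothetical number of conversations one GPU can keep resident

resident = OrderedDict()  # conversation_id -> kv_blob, least recently used first

def acquire(conv_id, recompute_prefill):
    """Reuse the GPU-resident cache if the thread is still warm, else evict one."""
    if conv_id in resident:
        resident.move_to_end(conv_id)  # warm hit: continuing is cheap
        return resident[conv_id]
    if len(resident) >= GPU_SLOTS:
        resident.popitem(last=False)   # evict the longest-idle conversation
    blob = recompute_prefill(conv_id)  # expensive: re-load every token so far
    resident[conv_id] = blob
    return blob
```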
They could let you nominate an S3 bucket (or Azure/GCP/etc equivalent). Instead of dropping data from the cache, they encrypt it and save it to the bucket; on a cache miss they check the bucket and try to reload from it. You pay for the bucket; you control the expiry time for it; if it costs too much you just turn it off.
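A sketch of that path, assuming boto3 and reusing the server-side-encryption idea from above (the bucket name, key scheme, and provider-held key are all hypothetical):

```python
import boto3
from cryptography.fernet import Fernet

s3 = boto3.client("s3")
fernet = Fernet(Fernet.generate_key())  # provider-held key; management elided

BUCKET = "customer-owned-cache-bucket"  # hypothetical, nominated by the customer

def evict_to_bucket(conv_id: str, kv_blob: bytes) -> None:
    """Instead of dropping the cache, encrypt it and park it in the user's bucket."""
    s3.put_object(Bucket=BUCKET, Key=f"kv-cache/{conv_id}",
                  Body=fernet.encrypt(kv_blob))

def reload_from_bucket(conv_id: str):
    """On a cache miss, try the bucket before re-running the full prefill."""
    try:
        obj = s3.get_object(Bucket=BUCKET, Key=f"kv-cache/{conv_id}")
    except s3.exceptions.NoSuchKey:
        return None  # expired by the customer's lifecycle policy, or never saved
    return fernet.decrypt(obj["Body"].read())
```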
Encryption can only ensure the confidentiality of a message against an untrusted third party, but when that untrusted third party happens to be your own machine hosting Claude Code, it's pointless. You could always dump the keys used to encrypt/decrypt the message from your machine's memory and use them to reconstruct the model weights.
jetbalsa said that the cache is on Anthropic's server, so the encryption and decryption would be server-side. You'd never see the encryption key; Anthropic would just give you an encrypted dump of the cache that would otherwise live on its server, then decrypt it with their own key when you replay the copy.
This is also an oversimplification. If I understand the issue correctly, the notification with the message contents was what was cached locally and then accessed. The same vulnerability would exist with Signal if you had notifications configured to display full message contents. In this case, it has nothing to do with either Apple or Signal.