antonvs's comments

It’s organizations figuring out how to monetize all the way up.

> Anthropic acquired Bun for their own benefit, to protect and grow their investment in Claude Code.

I’m unclear about this. What’s the business case? I use Gemini CLI a lot, which runs on Node, and I can’t see anything that would be improved by using a different JS runtime. It’s not something you notice as a user. Node is mature, stable, and perfectly fit for the purpose.

If Anthropic were public and if these decisions were comprehensible to the average investor, an acquisition like this ought to cause the stock to plummet. Luckily for the people involved, there are no constraints like that in the current market.


I haven’t followed Docker’s case in particular, but how much investment was required to get it to that point? If it’s a case of “How do you become a millionaire? Start as a billionaire and invest in Docker”, then the perception may have some basis.

The source code of Claude Code and Gemini CLI contradicts that.

Well sure, I can find you hundreds of dead ends among the mutations, but when you start parsing through what a harness does, eventually it's just another, more deterministic and constrained model doing less creative things.

Do you implement a DAG within your system to act as a kind of well-defined backbone for analysis and execution, or do you dispense with (explicit) DAGs entirely?

I've seen LLMs include that exact "production-ready" claim on code they generate. But of course it gets that from its training data.

I was reading the description trying to figure out what it actually does. I built remote k3s deployment over ssh into a product I worked on, and there really was very little to it. Shell into the machine, run the installer, set the config - and that last part is going to be unique to your situation anyway. It makes perfect sense that your setup got simpler after removing this.
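For a sense of scale, here's roughly what that amounted to, sketched with paramiko (the host, user, and config contents are placeholders, and it assumes key-based ssh auth plus passwordless sudo on the target):

    # Rough sketch of a remote k3s install over ssh using paramiko.
    # Host, user, and config contents are placeholders.
    import paramiko

    def install_k3s(host: str, user: str) -> None:
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect(host, username=user)
        try:
            for cmd in [
                # set the config first, then run the official installer
                "sudo mkdir -p /etc/rancher/k3s",
                "echo 'disable: [traefik]' | sudo tee /etc/rancher/k3s/config.yaml",
                "curl -sfL https://get.k3s.io | sh -",
            ]:
                _, stdout, stderr = client.exec_command(cmd)
                if stdout.channel.recv_exit_status() != 0:
                    raise RuntimeError(f"{cmd!r} failed: {stderr.read().decode()}")
        finally:
            client.close()

    install_k3s("node1.example.com", "ubuntu")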

I implemented a system that included the OP functionality (plus a whole lot more). It was for on-premises deployment at customers. It can also be used to spin up stand-alone instances of our system in the cloud, for development, testing, etc. While you could, in theory, do many deployments on a single k8s cluster, there are some benefits to the automatic isolation you get from deploying on a standalone VM.

I'm doing many deployments in my single k8s cluster. I just put them each in a different namespace.

The only piece that's maybe a little dicey is the single load balancer/gateway. If there's a hiccup in that, then everything goes down.

But I've only blown up my cluster once in like 8 years or something, that's not too bad. It was a learning experience :-)

What other kinds of isolation do you want? I can see maybe a separate staging environment if you want to test gnarly things like that ahead of rolling them out to prod. And I guess maybe they can eat each other's resources if you don't have resource limits or auto-scaling enabled.

But I'm cheap and managing more clusters sounds like a pain. Then I'd have to deal with more kubectl credentials and what not too.
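For what it's worth, the per-deployment namespace step really is tiny; a sketch with the official kubernetes Python client ("tenant-a" is a placeholder, and it reuses whatever kubectl credentials you already have):

    # Per-deployment namespaces with the official kubernetes Python client.
    # "tenant-a" is a placeholder name.
    from kubernetes import client, config
    from kubernetes.client.rest import ApiException

    config.load_kube_config()  # picks up your existing kubectl credentials
    v1 = client.CoreV1Api()

    def ensure_namespace(name: str) -> None:
        ns = client.V1Namespace(metadata=client.V1ObjectMeta(name=name))
        try:
            v1.create_namespace(ns)
        except ApiException as e:
            if e.status != 409:  # 409 means it already exists, which is fine
                raise

    ensure_namespace("tenant-a")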


LLMs are one of the most general abstractions possible.

LLMs are also quite deterministic if you want them to be - generally, their final token selection is deliberately randomized (the model “temperature”). But the word you’re looking for here is probably not actually determinism, it’s probably something closer to predictability.

In any case, it’s perfectly possible to ensure that the output of LLMs is fully deterministic, debuggable, understandable, and testable.
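To make the token-selection point concrete, here's a self-contained sketch with made-up logits standing in for a real model's output (nothing here touches an actual LLM API):

    # Token "selection" over hypothetical next-token scores. With
    # temperature > 0 we sample; at temperature 0 it collapses to
    # argmax, which is fully deterministic run to run.
    import numpy as np

    logits = np.array([2.0, 1.0, 0.5, 0.1])  # made-up next-token scores

    def pick_token(logits, temperature, rng):
        if temperature == 0.0:
            return int(np.argmax(logits))        # greedy: always the same token
        p = np.exp(logits / temperature)
        p /= p.sum()
        return int(rng.choice(len(logits), p=p)) # sampled: varies with the seed

    rng = np.random.default_rng()
    print([pick_token(logits, 1.0, rng) for _ in range(5)])  # may differ each run
    print([pick_token(logits, 0.0, rng) for _ in range(5)])  # always [0, 0, 0, 0, 0]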

> You cannot be serious.

I don’t think you’re thinking about this clearly.


Because it is not actually trying to prove anything about its outputs, setting the temperature to zero will just ensure it always makes the same mistake when "compiling" English into code.

A compiler, by contrast, always preserves the semantics of its input, even when randomness is used (e.g. during register allocation).
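A toy illustration of that distinction: a "compiler" that randomizes where each variable lives, while the meaning of the compiled program never changes:

    # Toy "compiler" that randomly shuffles which memory slot each
    # variable gets, yet the compiled program's meaning is unchanged.
    import random

    def compile_expr(env: dict[str, int]):
        names = list(env)
        random.shuffle(names)  # nondeterministic "register allocation"
        slots = {name: i for i, name in enumerate(names)}
        memory = [0] * len(names)
        for name, value in env.items():
            memory[slots[name]] = value
        # "Object code" for x*y + z, reading through the slot table:
        return lambda: memory[slots["x"]] * memory[slots["y"]] + memory[slots["z"]]

    prog = compile_expr({"x": 3, "y": 4, "z": 5})
    print(prog())  # 17, no matter how the slots were shuffled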


With a sufficiently complex prompt and a sufficiently complex codebase, LLMs consistently fail and make mistakes, "forget" parts of the prompt, etc.

There's no comparison to be made between this and, for example, a compiler. It's an incompetent comparison.

> I don’t think you’re thinking about this clearly.

My literal job is dealing with layers of abstraction. I'm thinking pretty clearly when I tell you that, not only are LLMs a super leaky, terrible abstraction, they are also not comparable to any other layers of abstraction. All other layers of abstraction we use are well understood, predictable (as you put it), and DEBUGGABLE.

When Claude deletes a fix it did two weeks ago, while trying to fix some unrelated error, do you never stop and think "this is not quite the same as what GCC does"?


> With a sufficiently complex prompt and a sufficiently complex codebase ...

With a sufficiently complex specification of a failure mode, you can find problems with anything.

Humans, given sufficiently complex requirements and sufficiently complex codebases, also regularly fail. You're tacitly admitting that LLMs are approaching (if not exceeding) human levels of performance now. We somehow get non-deterministic humans to achieve useful work. In fact, staff provide managers with an abstraction over the work they're responsible for - managers don't know every detail of the systems they're responsible for.

There are effective ways to use LLMs. I recommend using those, not using overly complex prompts, and not letting LLMs freely make changes to large code bases. Just as compilers only compile one source file at a time, LLMs work best if you scope their attention. Same goes for humans, in fact.

> There's no comparison to be made between this and, for example, a compiler.

A simple comparison is that both can generate useful code. You need to be more precise about the issues you're trying to identify.

Anyway, the comparison to compilers isn't really the point. It's undeniable that LLMs are an abstraction themselves, and that they can generate new abstractions. Saying that they're "not another abstraction" is just definitionally wrong.

Sure, they're not the same kind of abstraction as a traditional compiler. They require new ways of working, but actually not that new, as the manager example I gave suggests.

> When claude deletes a fix it did two weeks ago, while trying to fix some unrelated error, do you never stop and think "this is not quite the same as what GCC does"?

I never made the mistake of thinking LLMs were the same as GCC in the first place.

And once again, I've seen human developers do exactly what you just described. That's why we review code. All the arguments you're making are essentially also arguments that humans shouldn't be involved in software development either.


> LLMs are also quite deterministic if you want them to be

In the shallow sense that any PRNG is deterministic if you set the seed and if you control the triggering order.
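A trivial illustration of that shallow sense, using the standard library:

    import random

    random.seed(42)
    a = [random.random() for _ in range(3)]
    random.seed(42)
    b = [random.random() for _ in range(3)]
    assert a == b  # same seed, same "random" stream, every run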

However that's not usually the situation/scope people are talking about.


I was just pointing out, in part, that the non-determinism is a choice, but I probably would have needed to go down a whole rabbit hole about exploration of search spaces etc.

My broader point is that it's not really the non-determinism that's an issue. What the other commenter seems to be looking for is something along the lines of repeatable correctness, where correctness is generally a requirement that the model doesn't have full access to. The non-determinism is an implementation detail here.


It’s not even close to being at an end. Hardware would need to increase in cost by hundreds or even thousands of times to materially change that calculation.

Just as an example, the cost of one week of engineering time corresponds to tens of thousands of vCPU-hours, which is many years of CPU time.
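Back-of-the-envelope, with assumed round numbers (neither price is a quote):

    # Back-of-the-envelope; both prices are assumptions, not quotes.
    eng_week_usd = 4_000   # fully loaded cost of one engineer-week (assumed)
    vcpu_hour_usd = 0.10   # on-demand cloud vCPU-hour (assumed)

    vcpu_hours = eng_week_usd / vcpu_hour_usd
    print(f"{vcpu_hours:,.0f} vCPU-hours")         # 40,000 vCPU-hours
    print(f"{vcpu_hours / (24 * 365):.1f} years")  # ~4.6 years of CPU time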

As such, it only ever makes business sense to optimize code either when it has bottlenecks that can’t be fixed by throwing hardware at it, or when it’s so inefficient that it can be sped up by several orders of magnitude.

