Hacker News | einrealist's comments

> SQLite is not primarily fast because it is written in C. Well.. that too, but it is fast because 26 years of profiling have identified which tradeoffs matter.

Someone (with deep pockets to bear the token costs) should let Claude run for 26 months to have it optimize its Rust code base iteratively towards equal benchmarks. Would be an interesting experiment.

The article points out the general issue with discussing LLMs: audience and subject matter. We mostly trade anecdotes about interactions and results. We need much more data: more projects that succeed with LLMs, that fail with them, or that linger in a state of ignorance, sunk-cost fallacy and suppressed resignation. I expect that last case will remain the default we never hear about, the part of the iceberg that is underwater, mostly inside the corporate world or in private GitHubs. And that is true with LLMs and without them.

In my experience, 'Senior Software Engineer' has NO general meaning. It's a title awarded anew for each project or product. The same goes for the claim: "I, a Senior SWE, treat LLMs as Junior SWEs, and I am 10x more productive." Imagine me facepalming every time.


This would be a really interesting experiment.

I suspect performance is not the only problem with the codebase though.


If the output has problems, do you usually rerun the compilation with the same input (that you control)? I don't usually.

What is included in the 'verify' step? Does it involve changing the generated code? If not, how do you ensure things like code quality, architectural constraints, efficiency and consistency? It's difficult, if not (economically) impossible, to write tests for these things. What if the LLM does not follow the guidelines in your prompt? That still happens. If verification doesn't cover this, I would call it 'brute forcing'. How much do you pay for tokens?


I thought to myself that I do this pretty frequently, but then I realized it's only when I'm going from make -j8 to make -j1. I guess parallelism does throw some indeterminacy into this.


If parallelism adds indeterminacy, then you have a bug (probably in the dependency graph). Not an unusual one: lots of open-source projects in the 1990s carried warnings against building above -j1, because multi-core systems weren't that common and people weren't actually trying it themselves...


Whenever I traced them, those bugs were always in the logic of the makefile rather than in the compiler. A target in fact depends on another target (generally defined much earlier in the file), but the makefile doesn't specify that.
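A minimal sketch of that failure mode (file names hypothetical): `app` links against `libutil.a`, but the rule never declares the archive as a prerequisite. Serial make happens to build the targets in textual order; parallel make can link `app` before, or while, `libutil.a` is being built.

```makefile
all: libutil.a app

# BUG: libutil.a is used in the recipe but not listed as a
# prerequisite. With -j1 it is built first only because it
# appears first in the 'all' rule; with -j8 the two jobs race.
app: main.o
	cc -o app main.o libutil.a

libutil.a: util.o
	ar rcs libutil.a util.o

# Fix: declare the real dependency:
# app: main.o libutil.a
```

With `-j1` the textual ordering usually hides the missing edge; `-j8` exposes it intermittently, which is exactly the indeterminacy described above.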


The time I was able to run make -j 128 and it took 3 minutes to do what used to take an hour, I almost wet myself.


Why not show a summary of who actually received the data? It should be easy to implement. You could also add what data is retained and an estimate of how long it is kept for. It could be a summary page that I can print as a PDF after the process is complete.

I'd consider that a feature that would increase trust in such a platform. These platforms require trust, right?


What is the purpose of an AGENTS.md file when there are so many different models? Which model or version of the model is the file written for? So much depends on assumptions here. It only makes sense when you know exactly which model you are writing for. No wonder the impact is 'all over the place'.


I can follow the arguments, and I find many of them plausible. But LLMs are still unreliable and require attention and verification. Ultimately, it's an economic question: the cost of training the model and the computing power required to produce accurate results.

The strongest argument is the one about the interface. LLMs will definitely have a large impact there. But under the hood, I still expect to see a lot of formally verified code, written by engineers with domain knowledge, with AI support.


Reminds me of this: https://www.nbcnews.com/news/world/russia-plot-plant-bombs-c...

El Paso is a hub for cargo. It probably takes a few days to go through all those parcels.


They trained for it. That's the +0.1!


I am curious to know what he has in mind. This 'process engineering' could be a solution to the problems that BPM and COBOL try to solve. He might end up with another formalized layer of indirection (with rules and constraints for everyone to learn) that integrates better with LLM interactions (which are also evolving rapidly).

I like the idea that 'code is truth' (as opposed to 'correct'). An AI should be able to use this truth and mutate it according to a specification. If the output of an LLM is incorrect, it is unclear whether the specification is incorrect or if the model itself is incapable (training issue, biases). This is something that 'process engineering' simply cannot solve.


I'm also curious about what a process engineering abstraction layer looks like, though the final section does hint at it: more integration of more stakeholders, closer to the construction of code.

Though I have to push back on the idea of "code as truth". Thinking about all the layers of abstraction and indirection... hasn't the data and the database layer typically been the source of truth?

Maybe I'm missing something in this iteration of the industry where code becomes something other than what it's always been: an intermediary between business and data.


Yes, the database layer and the data itself are also sources of truth. Code (including code run inside the database, such as SQL, triggers, stored procedures and other native modules) defines behaviour. The data influences behaviour. This is why we can only test code with data that is as close to reality as possible, or even production data.


Why is it too big to fail? SpaceX could be dissected and the parts sold to the government or to competitors.

It's too big to fail for Musk, because it is one of his sources of money, largely paid for by US taxpayers.


I see no reason why Starship could not be dumped, with Falcon rockets kept in production as needed, maybe at higher cost.


Unless the economy crashes and I die from the consequences, there are so many pre-AI hard-cover books left to read...

