If the output has problems, do you usually rerun the compilation with the same input (that you control)? I don't usually.
What is included in the 'verify' step? Does it involve changing the generated code? If not, how do you ensure things like code quality, architectural constraints, efficiency and consistency? It's difficult, if not (economically) impossible, to write tests for these things. What if the LLM does not follow the guidelines outlined in your prompt? This is still happening. If this is not included, I would call it 'brute forcing'. How much do you pay for tokens?
I thought to myself that I do this pretty frequently, but then I realized that's only when I'm going from make -j8 to make -j1. I guess parallelism does throw some indeterminacy into this.
If parallelism adds indeterminacy, then you have a bug (probably in working out the dependency graph). Not an unusual one: lots of open source in the 1990s had warnings about not building above -j1, because multi-core systems weren't that common and people weren't actually trying it themselves...
Whenever I traced them, those bugs were always in the logic of the makefile rather than in the compiler. A target does in fact depend on another target (generally from much earlier in the file), but the makefile doesn't declare that dependency.
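To make that concrete, here is a rough Python sketch of the failure mode, not anything make itself runs. The target names (gen.h, main.o, app) and the two tables are made up for illustration, and the serial order assumes prerequisites are visited left to right, which is what a -j1 build generally does.

```python
# Hypothetical build: what the makefile DECLARES vs. what each recipe READS.
declared = {
    "gen.h": [],
    "main.o": [],                 # bug: the recipe reads gen.h but never says so
    "app": ["gen.h", "main.o"],
}
actual_reads = {
    "gen.h": [],
    "main.o": ["gen.h"],
    "app": ["gen.h", "main.o"],
}


def undeclared_deps(declared, actual_reads):
    """Map each buggy target to the files it reads without declaring them."""
    missing = {}
    for target, reads in actual_reads.items():
        extra = [f for f in reads if f not in declared[target]]
        if extra:
            missing[target] = extra
    return missing


def serial_order(declared, goal):
    """Depth-first, left-to-right order, roughly what a -j1 build follows."""
    order = []

    def visit(target):
        if target in order:
            return
        for dep in declared[target]:
            visit(dep)
        order.append(target)

    visit(goal)
    return order


def ready_together(declared, goal):
    """Targets with no declared deps: a parallel build may start them all at once."""
    return sorted(t for t in serial_order(declared, goal) if not declared[t])


print(undeclared_deps(declared, actual_reads))
print(serial_order(declared, "app"))
print(ready_together(declared, "app"))
```

In the serial order gen.h happens to come before main.o, so the missing edge never bites. With more than one job, both are eligible to start at the same moment, and main.o can be compiled before gen.h exists.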