First of all, I don't want to run anyone's code without proper explanation, so help me understand this.
Let's start with the verifier. The 3rd party verifier receives a bundle without knowing what the content is or having access to the tool used to measure, and just runs a single command based on the bundle, which presumably contains expected results and actual measurements, both of which can easily be tampered with. What problem does that solve?
Right question. Bundle alone proves nothing — you're correct.
Two things make it non-trivial to fake:
The pipeline is public. You can read scripts/steward_audit.py before running anything. It's not a black box.
For materials claims, the expected value isn't in the bundle. Young's modulus for aluminium is ~70 GPa. Not my number. Physics. The verifier checks against that, not against something I provided.
ML and pipelines: provenance only, no physical grounding. Said so in known_faults.yaml :: SCOPE_001.
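Rough sketch of what "checks against physics, not against me" means. Everything here is hypothetical for illustration (the reference table, the function name, the 5% tolerance, the bundle fields). It is not the code in scripts/steward_audit.py.

```python
# Hypothetical reference table. The point is that these values come from
# published handbooks, not from the bundle being verified.
REFERENCE_GPA = {"aluminium_youngs_modulus": 70.0}


def check_against_reference(claim: str, measured_gpa: float, rel_tol: float = 0.05) -> bool:
    """Return True if the measurement is within rel_tol of the independent reference."""
    expected = REFERENCE_GPA[claim]
    return abs(measured_gpa - expected) <= rel_tol * expected


# Assumed bundle shape: it carries the measurement, never the expected value.
bundle = {"claim": "aluminium_youngs_modulus", "measured_gpa": 69.2}
print(check_against_reference(bundle["claim"], bundle["measured_gpa"]))
```

Editing the bundle can change the measurement, but not the number it gets compared to. That only holds where an independent reference exists, which is why the ML and pipeline claims stay provenance-only.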
Claude + Cursor wrote the structure. I fixed hundreds of errors: wrong tests, broken pipelines, docs that didn't match the code. That's literally why the verification layer exists. AI gets it wrong constantly.
This comment is also Claude, on my direction. That's the point. Tool, not author.
It's surprising that AI coding agents have network effects but it's true. Think about it from first principles & you'll realize that the bottleneck is how many people are using it to write real code & providing both implicit (compiler errors, test failures, crash logs, etc) & direct ("did not properly follow instructions", "deleted main databases", "didn't properly use a tool", etc) feedback. No one is using xAI for serious software engineering so that leaves OpenAI, Anthropic, & Google w/ enough scale to benefit from network effects. No one has real AI but what they do have is the appearance of intelligence from crowdsourced feedback & filtering. This means companies that are already in the lead will continue to stay there & xAI started way too late so they will continue to lose in every domain that actually matters & benefits from network effects.
If one AI has 100 users who are writing throwaway software & another has 1000 users who are writing software w/ formal specifications then guess which AI is going to win? The answer is plainly obvious to me but might not be to those who haven't thought about how current AIs actually work.
Nice project. Ideally it should be possible to run arbitrary graph or datalog queries to get relevant items but a hierarchical organization w/ basic content type tagging + vector similarity search is a good starting point.
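For anyone wondering what the "good starting point" looks like, here is a minimal sketch: filter by path prefix and content type, then rank by cosine similarity. The item layout, the 2-d toy vectors and the `search` function are all made up for illustration and are not taken from the project.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


# Hypothetical store: hierarchy lives in the path, tagging in "type",
# and "vec" stands in for a real embedding.
ITEMS = [
    {"path": "notes/physics/qc.md", "type": "note", "vec": [1.0, 0.0]},
    {"path": "notes/cooking/bread.md", "type": "note", "vec": [0.0, 1.0]},
    {"path": "code/physics/sim.py", "type": "code", "vec": [0.9, 0.1]},
]


def search(query_vec, prefix="", content_type=None, k=3):
    """Filter by hierarchy and tag first, then rank the survivors by similarity."""
    candidates = [
        it for it in ITEMS
        if it["path"].startswith(prefix)
        and (content_type is None or it["type"] == content_type)
    ]
    ranked = sorted(candidates, key=lambda it: cosine(query_vec, it["vec"]), reverse=True)
    return [it["path"] for it in ranked[:k]]


print(search([1.0, 0.0], prefix="notes/", content_type="note"))
```

Arbitrary graph or datalog queries would replace the fixed prefix and type filter with a general predicate over relations between items.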
It works well enough for my use cases so I don't know what these folks are looking for. I have it configured to run everything in WSL sandbox so the blast radius is limited to the VM w/ the code.
What do you mean? The original 2019 supremacy experiment was eventually simulated, as better classical methods were found, but the followups are still holding strong (for example [4] and [5]).
There was recently a series of blog posts by Dominik Hangleiter summarizing the situation: [1][2][3].
The reason people pay attention to him is that he does a good job publicizing both positive and negative results, and accurately categorizing which are bullshit.
It doesn't matter how many "wunderkinds" Zuckerberg pays off to work at Meta. Meta is not an AI company so they will never produce anything of relevance in that domain.
Meta, Reddit, Twitter, they're staying around. Too much of the population has been captured in those places and is too passive and docile to seek out better options. The disruption we've seen in the past with social networks won't happen anymore. These places are so bad already that there's no reason to think the remaining people will leave for any reason.
Meta is the smallest but also the most profitable of the FAANGs in terms of percentage profit margin at 30%, vs 25% for Apple and Google and much less for Amazon and Netflix. Their position in social networks is a license to print money and unless humanity goes fully autistic all of a sudden, this is unlikely to change, technological shifts notwithstanding.
The one thing that can kill them is the fact each successive generation avoids their fuddy-duddy parents' social network, so Boomers and Gen X are on Facebook, Millennials on Instagram and Gen Z on TikTok. If TikTok had been killed as was originally the plan, they would have benefited massively, but Trump does not trust Zuck and made sure it went to his son Barron and the Ellisons.