I have been wondering if the 1M-token context also contributes here. Compaction is much rarer now. How does that influence model performance? For some tasks I do, I feel like performance is worse now after this. Also, Plan mode doesn't seem to wipe context anymore?
Also a good fallback if your phone screen cracked 2 hours before the event. But I can imagine part of the challenge they are facing here is scalpers. The TicketMaster app 'rotates' the actual ticket every 30 seconds. Can't rotate paper.
I'd think that having a 2nd factor like presenting ID that matches the ticket would be sufficient there though.
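For context on how a "rotating" ticket works: the scheme resembles TOTP (RFC 6238), where a code is derived from a shared secret and the current 30-second time window, so a screenshot goes stale almost immediately. This is a minimal sketch of that general technique, not TicketMaster's actual implementation; the secret and digit count are hypothetical.

```python
import hmac
import hashlib
import struct
import time

def rotating_code(secret: bytes, interval: int = 30) -> str:
    """TOTP-style 6-digit code that changes every `interval` seconds."""
    counter = int(time.time()) // interval          # current time window
    msg = struct.pack(">Q", counter)                # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                      # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return f"{code % 10**6:06d}"

# A screenshotted code is only valid until the next 30-second window.
print(rotating_code(b"hypothetical-ticket-secret"))
```

The scanner recomputes the same value server-side, so a static copy (paper or screenshot) fails verification once the window rolls over.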
128GB is the max RAM that the current Strix Halo supports, with ~250GB/s of bandwidth. The Mac Studio is 256GB max with ~900GB/s of memory bandwidth. They are in different categories of performance, and even bandwidth-per-dollar is worse (~$2700 for the Framework Desktop vs $7500 for the Mac Studio M3 Ultra).
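The bandwidth-per-dollar claim checks out with the figures quoted above (approximate street prices, not official specs):

```python
# Rough bandwidth-per-dollar comparison using the approximate numbers
# quoted above (~$2700 / ~250GB/s vs ~$7500 / ~900GB/s).
systems = {
    "Strix Halo (Framework Desktop)": (2700, 250),
    "Mac Studio M3 Ultra":            (7500, 900),
}

for name, (price_usd, bandwidth_gbps) in systems.items():
    print(f"{name}: {bandwidth_gbps / price_usd:.3f} GB/s per dollar")
# Strix Halo comes out around 0.093 GB/s/$, the M3 Ultra around 0.120 GB/s/$.
```

So despite the much higher sticker price, the Mac Studio delivers more memory bandwidth per dollar at these numbers.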
I would honestly guess that this is just a small amount of tweaking on top of the Sonnet 4.x models. It seems like providers rarely train new 'base' models anymore. We're at a point where the gains come more from modifying the model's architecture and doing post-training refinement. That's what we've been seeing for the past 12-18 months, iirc.
> Claude Sonnet 4.6 was trained on a proprietary mix of publicly available information from the internet up to May 2025, non-public data from third parties, data provided by data-labeling services and paid contractors, data from Claude users who have opted in to have their data used for training, and data generated internally at Anthropic. Throughout the training process we used several data cleaning and filtering methods including deduplication and classification. ... After the pretraining process, Claude Sonnet 4.6 underwent substantial post-training and fine-tuning, with the intention of making it a helpful, honest, and harmless assistant.
Does anybody know when Codex is going to roll out subagent support? That has been an absolute game changer in Claude Code. It lets me run with a single session for so much longer and chip away at much more complex tasks. This was my biggest pain point when I used Codex last week.
I've been working on decompiling Dance Central 3 with AI and it's been insane. It's an Xbox 360 game that leverages the Kinect to track your body as you dance. It's a great game, but even with an emulator, it's still dependent on the Kinect hardware, which is proprietary and in limited supply.
Fortunately, a Debug build of this game was found on a dev unit (somehow), and that build does _not_ have the aggressive optimizations (like link-time optimization) that would make this feat impossible.
I am not somebody that is deep on low level assembly, but I love this game (and Rock Band 3 which uses the same engine), and I was curious to see how far I could get by building AI tools to help with this. A project of this magnitude is ... a gargantuan task. Maybe 50k hours of human effort? Could be 100k? Hard to say.
Anyway, I've been able to make significant progress by building tools for Claude Code to use and just letting Haiku rip. Honestly, it blows me away. Here is an example that is 100% decompiled now (it compiles to the exact same code as in the binary the devs shipped).
I've now added and worked on over 1k functions in my branch[0]. Some of it is slop, but I wrote a skill that's been able to get the code quite decent with another pass. I even implemented vmx128 (custom 360-specific CPU instructions) in Ghidra and m2c to allow them to decompile more code. Blows my mind that this is possible with just hours of effort now!
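For anyone unfamiliar with matching decompilation: "100% decompiled" means the recompiled function's machine code is byte-identical to the same function in the shipped binary. A minimal sketch of that check, with stand-in PowerPC-looking bytes and hypothetical offsets (real projects diff objdump/disassembly output, not raw slices):

```python
def function_matches(original: bytes, recompiled: bytes,
                     orig_off: int, size: int) -> bool:
    """True when the recompiled bytes are identical to the
    same-sized range in the original binary."""
    return original[orig_off:orig_off + size] == recompiled[:size]

# Stand-in data: a short PowerPC-style prologue, duplicated to simulate a match.
shipped = bytes.fromhex("9421fff07c0802a693e1000c")
rebuilt = bytes.fromhex("9421fff07c0802a693e1000c")
print(function_matches(shipped, rebuilt, 0, len(rebuilt)))  # True when byte-identical
```

Even one differing instruction (a reordered register allocation, a different immediate) fails the check, which is why optimized release builds are so much harder to match than the Debug build mentioned above.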
I spent way too many hours writing all this today, but I wanted to get it pushed out for others to learn from. There is a ton of detail in this notes file[0] that Claude Code helped me assemble.
If anybody has any suggestions or questions, shoot! It's 4am though so I'll be back in a bit. These CVEs are quite brutal.