Hacker News | freeqaz's comments

I have been wondering if the 1 million token context contributes here too. Compaction is much rarer now. How does that influence model performance? For some tasks I do, I feel like performance is worse now than before. Also, Plan mode doesn't seem to wipe context anymore?

I beg to differ. Compaction happens a lot for me, and at some point the output becomes extremely nonsensical.

Also a good fallback if your phone screen cracked 2 hours before. But I can imagine part of the challenge they are facing here is scalpers. The Ticketmaster app 'rotates' the actual ticket every 30 seconds. You can't rotate paper.

I'd think that having a second factor, like presenting an ID that matches the ticket, would be sufficient there though.


You don’t need the app itself to get the rotating tickets; the algorithm is pretty dumb and was reverse-engineered back in 2024. https://news.ycombinator.com/item?id=40906148

You probably still need a device of some kind though.
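That matches how TOTP-style rotation generally works: the token is just an HMAC over a shared secret and the current time window, so any device holding the secret can produce valid codes. A minimal sketch of the idea (the parameters and encoding here are illustrative, not Ticketmaster's actual scheme):

```python
import base64
import hashlib
import hmac
import time

def rotating_token(secret: bytes, period: int = 30, now=None) -> str:
    """Derive a short-lived token from a shared secret and the current
    time window, TOTP-style (simplified, loosely after RFC 6238)."""
    counter = int((now if now is not None else time.time()) // period)
    msg = counter.to_bytes(8, "big")
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    # A barcode would encode this value; any deterministic encoding works.
    return base64.b32encode(digest[:10]).decode()

secret = b"shared-at-ticket-issuance"  # hypothetical provisioning step
print(rotating_token(secret))  # changes every 30 seconds
```

Because the output is pure math over a shared secret plus wall-clock time, anyone who extracts the secret can generate valid codes without the official app, which is exactly what the linked reverse-engineering demonstrated.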


128GB is the max RAM that the current Strix Halo supports, with ~250GB/s of bandwidth. The Mac Studio is 256GB max with ~900GB/s of memory bandwidth. They are in different categories of performance, and even performance-per-dollar is worse for the Strix Halo (~$2700 for the Framework Desktop vs. $7500 for the Mac Studio M3 Ultra).
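The bandwidth numbers matter because batch-1 LLM inference is typically memory-bandwidth-bound: each generated token has to stream roughly the full set of active weights. A back-of-the-envelope ceiling, assuming a dense model and ignoring KV-cache traffic and other overhead:

```python
def max_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Theoretical ceiling for batch-1 decode: each token reads every
    active weight once, so rate is roughly bandwidth / model size."""
    return bandwidth_gb_s / model_size_gb

# Example: a 70B-parameter model at 8-bit quantization is ~70 GB of weights.
for name, bw in [("Strix Halo", 250), ("M3 Ultra", 900)]:
    print(f"{name}: ~{max_tokens_per_sec(bw, 70):.1f} tok/s ceiling")
```

Real throughput lands below these ceilings, but the ~3.6x bandwidth gap translates fairly directly into a ~3.6x gap in decode speed for the same model.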


You can call Cerebras APIs via OpenRouter if you specify them as the provider in your request fyi. It's a bit pricier but it exists!
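A sketch of what that request looks like, assuming OpenRouter's provider-routing options (`order` plus `allow_fallbacks`); the model slug and provider slug here are illustrative, so check OpenRouter's docs for the exact values:

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def pinned_request_body(model: str, prompt: str) -> dict:
    """Build a chat request that asks OpenRouter to use one specific
    provider, erroring out instead of falling back to another."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "provider": {
            "order": ["cerebras"],     # try Cerebras first
            "allow_fallbacks": False,  # fail rather than route elsewhere
        },
    }

def send(api_key: str, body: dict) -> dict:
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(body).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

body = pinned_request_body("qwen/qwen3-coder", "Hello!")
```

Setting `allow_fallbacks` to false is the important part if you specifically want Cerebras speeds: otherwise OpenRouter may quietly route to a cheaper, slower provider.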


I used their API normally (pay per token) a few weeks ago. Their Coding Plan appears to be permanently sold out though.


If it maintains the same price (which Anthropic tends to do, or they undercut themselves), then this would be 1/3rd of the price of Opus.

Edit: Yep, same price. "Pricing remains the same as Sonnet 4.5, starting at $3/$15 per million tokens."


$3 is not 1/3 of $5 though. Opus costs $5/$25.
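Spelling out the arithmetic: $3/$15 against Opus at $5/$25 is 60% of the price on both input and output tokens, not a third:

```python
sonnet_in, sonnet_out = 3, 15  # $ per million tokens (input, output)
opus_in, opus_out = 5, 25

print(sonnet_in / opus_in)    # 0.6
print(sonnet_out / opus_out)  # 0.6
```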


I would honestly guess that this is just a small amount of tweaking on top of the Sonnet 4.x models. It seems like providers are rarely training new 'base' models anymore. We're at a point where the gains come more from modifying the model's architecture and doing post-training refinement. That's what we've been seeing for the past 12-18 months, iirc.


> Claude Sonnet 4.6 was trained on a proprietary mix of publicly available information from the internet up to May 2025, non-public data from third parties, data provided by data-labeling services and paid contractors, data from Claude users who have opted in to have their data used for training, and data generated internally at Anthropic. Throughout the training process we used several data cleaning and filtering methods including deduplication and classification. ... After the pretraining process, Claude Sonnet 4.6 underwent substantial post-training and fine-tuning, with the intention of making it a helpful, honest, and harmless assistant.


Nope. They need to update/retrain older base models regularly. Take programming as an example: the field evolves faster than anything else.

Stuff from last year will be outdated today.


Does anybody know when Codex is going to roll out subagent support? That has been an absolute game changer in Claude Code. It lets me run with a single session for so much longer and chip away at much more complex tasks. This was my biggest pain point when I used Codex last week.


It's already out.


Can you explain how to use it? I’ve tried asking it to do “create 3 files using multiple sub agents” and other similar wording. It never works.

Is it in the main Codex build? There doesn’t seem to be an experiment for it.

https://github.com/openai/codex/issues/2604


I've been working on decompiling Dance Central 3 with AI and it's been insane. It's an Xbox 360 game that leverages the Kinect to track your body as you dance. It's a great game, but even with an emulator, it's still dependent on the Kinect hardware, which is proprietary and in limited supply.

Fortunately, a Debug build of this game was found on a dev unit (somehow), and that build does _not_ have the aggressive optimizations (link-time optimization) that would make this feat impossible.

I am not somebody who is deep into low-level assembly, but I love this game (and Rock Band 3, which uses the same engine), and I was curious to see how far I could get by building AI tools to help with this. A project of this magnitude is ... a gargantuan task. Maybe 50k hours of human effort? Could be 100k? Hard to say.

Anyway, I've been able to make significant progress by building tools for Claude Code to use and just letting Haiku rip. Honestly, it blows me away. Here is an example that is 100% decompiled now (the functions compile to the exact same code as in the binary the devs shipped).

https://github.com/freeqaz/dc3-decomp/blob/test-objdiff-work...

My branch has added and worked on over 1k functions now[0]. Some of it is slop, but I wrote a skill that's been able to get the code quite decent with another pass. I even added VMX128 support (custom 360-specific CPU instructions) to Ghidra and m2c to let them decompile more code. It blows my mind that this is possible with just hours of effort now!

Anybody else played with this?

0: https://github.com/freeqaz/dc3-decomp/tree/test-objdiff-work...
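The core loop in a workflow like the one above is: decompile a function, recompile the candidate C, and diff the resulting assembly against the shipped binary until they match instruction-for-instruction. A toy version of the matching step (a hypothetical helper, not the actual objdiff logic), which drops address columns and compares mnemonics plus operands position by position:

```python
def normalize(listing: str) -> list[tuple[str, ...]]:
    """Strip the address column from a disassembly listing, keeping only
    mnemonic + operands, so two builds can be compared line-by-line."""
    out = []
    for line in listing.strip().splitlines():
        parts = line.split()
        # Assume 'ADDR: mnemonic operands...' lines; drop the address.
        if parts and parts[0].endswith(":"):
            parts = parts[1:]
        if parts:
            out.append(tuple(parts))
    return out

def match_percent(target_asm: str, candidate_asm: str) -> float:
    """Percentage of instructions that match exactly, position by position."""
    a, b = normalize(target_asm), normalize(candidate_asm)
    n = max(len(a), len(b))
    if n == 0:
        return 100.0
    same = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * same / n

# Illustrative PowerPC snippets: one stack offset differs between builds.
target = """
82000000: mflr r12
82000004: stw r12,-8(r1)
82000008: blr
"""
candidate = """
82000000: mflr r12
82000004: stw r12,-4(r1)
82000008: blr
"""
print(f"{match_percent(target, candidate):.1f}% match")
```

Real matching tools also canonicalize relocations and branch targets before diffing, but a percentage score like this is what the agent can iterate against until it hits 100%.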


I assume that the author here is testing against one of these boxes, right? https://marketplace.nvidia.com/en-us/enterprise/personal-ai-...

Are these considered a good deal at $3-4k? What's the software support like on them? I've got 2x 3090s and I'm curious how this compares.


> https://marketplace.nvidia.com/en-us/enterprise/personal-ai-...

In Europe these cost 5k EUR. I guess I'm not buying a computer ever again, and hopefully the 10-year-old ones I have will never die.


DGX Spark vs. Strix Halo vs. M4 Max is hotly debated. You can find plenty of HN discussions and YouTube videos about it.


I spent way too many hours writing all of this today, but I wanted to get it pushed out for others to learn from. There is a ton of detail in this notes file[0], which Claude Code helped me assemble.

If anybody has any suggestions or questions, shoot! It's 4am though so I'll be back in a bit. These CVEs are quite brutal.

0: https://github.com/freeqaz/react2shell/blob/master/EXPLOIT_N...

