Hacker News: dbreunig's comments

Model testing and swapping is one of the surprises people really appreciate DSPy for.

You're right: prompts are overfit to models. You can't just change the provider or target and know you're giving it a fair shake. But if you have eval data and have been using a prompt optimizer with DSPy, you can try new models with a one-line change followed by rerunning the prompt optimizer.
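As a rough sketch of that workflow: everything below is an illustrative stub (model ids, scores, and function names are made up), except that in actual DSPy the model swap really is one line, e.g. `dspy.configure(lm=dspy.LM("..."))`, after which you re-run the optimizer against your eval data.

```python
# Stand-in for "swap the model, re-run the prompt optimizer, compare evals".
# Scores are hypothetical; in practice optimize_and_score would configure the
# LM, re-run the optimizer, and score the compiled program on held-out evals.

FAKE_EVAL_SCORES = {
    "openai/gpt-4o-mini": 0.91,  # hypothetical score
    "gemini/gemma-3-12b": 0.74,  # hypothetical score
}

def optimize_and_score(model_name: str) -> float:
    """Stub for: configure the LM, re-run the prompt optimizer, return the metric."""
    return FAKE_EVAL_SCORES[model_name]

def pick_best(models: list[str]) -> str:
    # The "confident decision" step: pick whichever model evals best.
    return max(models, key=optimize_and_score)

print(pick_best(list(FAKE_EVAL_SCORES)))  # → openai/gpt-4o-mini
```

The point is that the comparison is driven by your eval metric, not by debate about which model "feels" stronger.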

Dropbox just published a case study where they talk about this:

> At the same time, this experiment reinforced another benefit of the approach: iteration speed. Although gemma-3-12b was ultimately too weak for our highest-quality production judge paths, DSPy allowed us to reach that conclusion quickly and with measurable evidence. Instead of prolonged debate or manual trial and error, we could test the model directly against our evaluation framework and make a confident decision.

https://dropbox.tech/machine-learning/optimizing-dropbox-das...


It's not just about fitting prompts to models, it's things like how web search works, how structured outputs are handled, various knobs like level of reasoning effort, etc. I don't think the DSPy approach is bad but it doesn't really solve those issues.

Funnily enough, the model switching is mostly thanks to LiteLLM, which DSPy wraps around.

No reason it can't. I know people currently generating specs from existing code; just gotta write the pipeline.


Last year they pushed out an update stating that if "Meta AI" is left on, they can access image data for training.

I turned the AI off and used them as headphones and for taking videos while biking. After a couple rides, I couldn't bring myself to put them on because people started to recognize them and I realized I didn't want to be associated with them (people are right to assume Meta has access to what they see).

Meta Ray Bans, if kept simple, could have been a great product. They ruined them.


I think public shaming of that spyware should be a social norm.


Check out “Recursive Language Models”, or RLMs.

I believe this method works well because it turns a long-context problem (hard for LLMs) into a coding and reasoning problem (much better!). You're leveraging the last 18 months of coding RL by changing your scaffold.


This seems really weird to me. Isn't that just using LLMs in a specific way? Why come up with a new name "RLM" instead of saying "LLM"? Nothing changes about the model.


"Think step by step," was just a sentence you appended to your prompt.

It ended up kicking off reasoning training which enabled the massive gains in coding, tool use, and more over the last 18 months.

So yeah, it's "just using LLMs in a specific way."


RLMs are a new architecture, but you can mimic an RLM by providing the context through a tool, yes


A new architecture for building agents, but not for the model itself. You still have LLMs, but you wrap them in a new agentic loop with a REPL environment where the LLM can try to solve the problem more programmatically.
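That loop can be sketched in a few lines. The `fake_llm` below is a hard-coded stand-in for a real model call, and a real RLM setup would sandbox the exec step; this only illustrates the shape of the pattern:

```python
# Toy sketch of the agentic-REPL pattern: the model never receives the full
# context directly; it writes code that inspects the context programmatically.

def fake_llm(prompt: str) -> str:
    """Stand-in for an LLM that, given the task, writes code to run."""
    return "result = sum(len(line) for line in context.splitlines())"

def repl_loop(task: str, context: str):
    code = fake_llm(f"Task: {task}\nWrite Python; `context` is in scope.")
    env = {"context": context}
    exec(code, env)  # the "REPL" step; real systems sandbox this
    return env["result"]

print(repl_loop("count characters per line, summed", "abc\ndefg"))  # 3 + 4 = 7
```

A real implementation would also loop (feeding results back to the model until it's satisfied), but the key move is the same: long context becomes data the model's code operates on.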


Author of the post here.

I didn’t say AI was bad and I acknowledged the benefits of Electron and why it makes sense to choose it.

With 64GB of RAM on my Mac Studio, Claude Desktop is still slow! Good Electron apps exist; it's just an interesting note given the recent spec-driven development discussion.


Not coming at you at all; AI is a touchy subject on HN nowadays in any capacity and brings out the worst here.


I keep saying this, it’s my new favorite metaphor.


That's cute.


Agree. I bucket things into three piles:

1. Batch/Pipeline: Processing a ton of things, with no oversight. Document parsing, content moderation, etc.

2. AI Features: An app calls out to an AI-powered function. Grammarly might send out a document for a summary, a CMS might want to generate tags for a post, etc.

3. Agents: AI manages the control flow.

So much of the discussion online is heavily focused on agents that it skews the macro view, but these patterns are pretty distinct.
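The three piles above, sketched as function shapes (all names here are illustrative, with a plain callable standing in for the model):

```python
# 1. Batch/Pipeline: run the model over everything, no oversight per item.
def batch_pipeline(documents, llm):
    return [llm(f"Parse: {d}") for d in documents]

# 2. AI Feature: the app calls out to one AI-powered function.
def summarize_feature(document, llm):
    return llm(f"Summarize: {document}")

# 3. Agent: the model decides what happens next (it owns the control flow).
def agent(goal, llm, tools):
    state = goal
    while True:
        action = llm(state)
        if action == "done":
            return state
        state = tools[action](state)
```

The structural difference is who holds the loop: in 1 and 2 the application does; in 3 the model does.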


There was a good study on this a few years ago that ran the numbers and landed on white paint for residential homes as the best option, for a few reasons, if I remember correctly:

- Installation, maintenance, and transmission costs are lower when solar is aggregated on farms

- Solar offsets air conditioning, but that moves the heat outside; white roofs reduce the need for AC, which helps significantly with urban heat scenarios

A quick search yields a UCL study, which supports the latter claim: https://phys.org/news/2024-07-roofs-white-city.html


Yes, if you put unrelated stuff in the prompt you can get different results.

One team at Harvard found that mentioning you're a Philadelphia Eagles fan let you bypass ChatGPT alignment: https://www.dbreunig.com/2025/05/21/chatgpt-heard-about-eagl...


Don't forget also that Cat Facts tank LLM benchmark performance: https://www.dbreunig.com/2025/07/05/cat-facts-cause-context-...

