Hacker News | js8's comments

I thought that you were about to write: "as a janitor in a restaurant, the dessert topping is sometimes used as a floor polish".

Something as expensive as dessert toppings would only be used as floor polish by the people who truly were high... and only if they could do it without the boss knowing what they were doing.

A very human thing to do - not telling us which model failed like this! They are not all alike; some are, from what I observe, an order of magnitude better at this kind of stuff than others.

I believe how "neurotypical" (for lack of a better word) you want the model to be is a design choice. (But I also believe model traits such as sycophancy, some hallucinations, or moral transgressions can be a side effect of training it to be subservient. It is similar with humans; they tend to do these things when they are forced to perform.)


Codex in this case. I didn't even think about mentioning it. I'll update the post if it's actually relevant. Which I guess it is.

EDIT: It's specifically GPT-5.4 High in the Codex harness.


Weird, for me it was too un-human at first, taking everything literally even when it didn't make sense. I started being more precise with prompting, to the point where it felt like "metaprogramming in English".

Claude, on the other hand, was exactly as described in the article.


Also the exact model/version if you haven't already.

Also, there are no specific examples of what the prompt was and what the result was. Just a big nothingburger.

Can you be more specific?

Trump issued an EO against "woke AI" that allows the administration to directly influence how models respond.

https://www.lawfaremedia.org/article/evaluating-the--woke-ai...


I think they're counting on an ego hit - "you're just a tool" - although it might be negated by the human satisfaction of figuring things out.

I recently came across this presentation https://youtu.be/QxkRf-xSfgI, and it changed my view of AI quite significantly. (There is also a paper https://arxiv.org/html/2510.12066v2 .)

The fundamental idea is that "intelligence" really means trying to shorten the time it takes to figure something out. So it's a tradeoff, not a quality. And AI agents are doing exactly that.

Therefore, if that perspective is right, the issues that the OP describes are inherent to intelligent agents. They will try to find shortcuts, because that's what they do, it's what makes them intelligent in the first place.

People with ASD, ADHD, or OCD are idiot-savants in the sense of that paper: they insist on searching for solutions that are not easy to find, despite common sense (aka intelligence) telling them otherwise.

It's a paradox that it is valuable to do this, but it is not smart. And it's probably why CEOs beat geniuses in the real world.


CEOs beat geniuses in the real world because they often have other pathologies, like enough moral flexibility to ignore the externalities of their profit centers.

I'd also argue there's some training bias in the performance; it's not just smart shortcuts... Claude especially seems prone to getting into a "wrap it up" mode even when the plan is only halfway completed, and starts deferring rather than completing tasks.


> The fundamental idea is that "intelligence" really means trying to shorten the time to figure out something.

"Figure out" implies awareness and structured understanding. If we relax the definition too much, then puddles of water are intelligent and uncountable monkeys on typewriters are figuring out Shakespeare.



Basically the same on ChatGPT. DeepSeek managed to generate output rather than meta-discussion about how to generate output.

That's actually interesting, thanks. It's like AI is tattling on itself.

Should it be relevant, though? It seems to me like criminalization of thoughts, even if they are externalized into a diary.

If you write in your diary "I'm gonna kill her" and then she gets killed, it's relevant.

If you were caught with notebooks detailing your plans to kill a list of people, showing that you had meticulously tracked their movements and listing locations for dumping the bodies, that would be extremely relevant. I don't see how it'd be a good idea to exclude that kind of evidence.

Depends, if you wrote a detailed confession with material non public facts, a jury can hear it and weigh the evidence.

When Agile came to the company (a large American corp) I work for, around 2015 (arguably quite late), I was quite skeptical. In my opinion, a decent waterfall (basically a sort of compromise) worked pretty well, and I didn't like fake "innovations" like Scrum or renaming everything in project-management terminology.

Then I read Steve Yegge's Good Agile, Bad Agile. It basically says, Agile is just a Kanban queue. And I think I got it, and I think that's working very well. At least from the project management side.

There are, IMHO, three management angles from which to look at any engineering project: product, project, and architecture. If you are building a house, you need a blueprint to tell where to put what concrete, you need a render (or paper model) to show to a customer, and you need a BOM and a timeline to make the investors happy. Software is no different. But that's also where the misunderstandings about Agile come from: product management, project management, and engineering all have different ideas of what kind of "plan" is needed.

So in the case of software, specs are like the house's blueprint. In some cases, specs might be a useful prototype; in some cases not. It's just not the type of plan that project or product management cares about.

Regarding the project management angle, for me Agile today is clearly Kanban, and almost everything else is wrong or not required. I often make an analogy with computers. In the 50s and 60s, people tried to plan the work a computer performs by creating scheduling algorithms that plan the use of resources ahead of time, avoid conflicts, and such. Eventually, we found out that simple dispatch queues work best: don't estimate at all how long a task will take; just give it a priority and a time slice, and let it run. I think the same applies to SW development, and it's time project management people took note from computer scientists - they already know.
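The dispatch-queue analogy can be sketched in a few lines. This is a hypothetical toy scheduler (the names `Dispatcher`, `submit`, `run` are made up for illustration, not from any real tool): each task gets only a priority and a fixed time slice, and nothing estimates up front how long it will take.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

@dataclass(order=True)
class _Entry:
    # Ordered by (priority, seq): lower priority number runs first,
    # ties broken by submission order.
    priority: int
    seq: int
    name: str = field(compare=False)
    remaining: int = field(compare=False)  # work units left; the scheduler never sees this up front

class Dispatcher:
    """Toy priority dispatch queue: no estimates, just priority + time slice."""

    def __init__(self, time_slice=2):
        self.time_slice = time_slice
        self._heap = []
        self._seq = count()

    def submit(self, name, priority, work):
        heapq.heappush(self._heap, _Entry(priority, next(self._seq), name, work))

    def run(self):
        """Run everything round-robin within priority; return completion order."""
        done = []
        while self._heap:
            entry = heapq.heappop(self._heap)
            entry.remaining -= self.time_slice
            if entry.remaining > 0:
                # Not finished: requeue at the same priority, go to the next task.
                heapq.heappush(
                    self._heap,
                    _Entry(entry.priority, next(self._seq), entry.name, entry.remaining),
                )
            else:
                done.append(entry.name)
        return done

d = Dispatcher(time_slice=2)
d.submit("bugfix", priority=0, work=3)   # high priority
d.submit("feature", priority=1, work=5)  # lower priority, long
d.submit("chore", priority=1, work=1)    # lower priority, short
result = d.run()
print(result)  # completion order emerges; nobody estimated anything
```

Note the design point from the comment: the scheduler's only inputs are priority and time slice; `remaining` models reality, not a plan.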

That doesn't mean SW development time cannot be estimated if you need to; it's just not very efficient to do so (it takes extra time, depending on how good an estimate you want).


I would agree; it makes them anything but elementary. I am honestly not even sure whether there is a finite, constructible basis of functions that can express every root of a single-variable integer polynomial.

And for multivariate polynomials, deciding whether integer roots exist is undecidable, by the MRDP theorem.


It is not known, and the model problem for this is Hilbert's 13th [1].

Nonetheless, "elementary function" is a technical term dating back to the 19th century; it's very much not a general adjective whose synonym is "basic".

[1] https://en.wikipedia.org/wiki/Hilbert%27s_thirteenth_problem


Thanks, actually https://en.wikipedia.org/wiki/Elementary_function confirms your claim.

Nevertheless, it is a horrible definition. Mathematicians have often taken care to define things as closely to everyday intuition as they could (and then proved an equivalence). The "elementary function" of this definition is just a weird mix of concerns.
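For reference, the definition on that page amounts (roughly) to a closure property; an informal sketch, not verbatim from any source:

```latex
% Elementary functions: roughly, the smallest class containing
% constants, the identity x, exp, and log (over a suitable domain),
% closed under arithmetic, n-th roots, and composition.
\mathcal{E} \;=\; \operatorname{closure}\bigl(\{\, c,\; x,\; e^{x},\; \ln x \,\}\bigr)
\quad \text{under} \quad \{+,\, -,\, \times,\, \div,\, \sqrt[n]{\;\cdot\;},\, \circ\}
```

The "mix of concerns" is visible here: algebraic closure operations sit next to two transcendental primitives chosen largely by convention.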


"Elementary function" is also a general English phrase that can very much be used to describe the functions on a scientific calculator.

The proof that free markets are efficient (even in the narrow sense economists use this word) relies on an assumption of perfect information. This has been known at least since Akerlof.

The Misesian folks are a lost cause, IMHO. They're hardcore rationalists, self-indulging in circular moral arguments from assumptions that don't apply in the real world.


That's what makes the insider trading argument so tantalizing--it's arguing that it helps move the market closer to perfect information. But, of course, the world is complicated and dynamic, and it tacitly depends on all kinds of assumptions and beliefs about the resulting costs and benefits. It would be nice if the debate shifted to pinning down those assumptions, quantifying them as best as possible, and then iteratively tweaking and adjusting regulatory models. But that's true of just about everything and probably too unrealistic an ask, especially at a time when one side is convinced markets are just a mechanism for unjust exploitation, and the other side is convinced regulation is what sustains inequity (to the extent inequity is something even worth caring about).
