This is such a weird prompt even without the file-edit misunderstanding. Analyze whether it's malware how, exactly? On every single file that gets read? Doing that with enough diligence to be meaningful would at least double the amount of processing needed, and would fill the context with a bunch of tangential reasoning about malware patterns.
This smacks of dumb vibe coding. "I got told to make sure claude couldn't be used to develop malware, ok 'claude pls no develop malware'"
That is exactly the impression I get from the Claude Code team, and by extension from some of their recent launches like Cowork and Design. And of course from the growth team, or whoever is in charge of the subscription and quota side of things.
They just run the basic experiment -> ship workflow over and over again, doing whatever optimizes their product in the short term, and never seem to step back and think about the full long-term impact of their changes. They seem not to even consider immediate regressions or negative blowback from users if it's not within the area of expertise of the guy who ships the change.
That is despite their other teams (especially alignment) having a track record of being fairly well thought-out and intelligent.
To the guys at Anthropic's product teams, every problem is a data science problem that you slap an A/B test onto; they seem to think the A/B test is all that's needed, and that actually verifying and thinking things through is overrated af. That's what leads to countless regressions in Claude Code, as well as to Claude Code disappearing from the Pro plan on their product page for a few hours (lol).
Tbf, their harness was surprisingly ahead of the curve for most of the last year...
At this point, the difference mostly comes down to issues like the one the OP describes, so you're likely better off using e.g. pi (-agent) and writing your own custom skills and extensions (or any of the other harnesses the providers create; even copilot-cli has gotten decent nowadays)
> Tbf, their harness was surprisingly ahead of the curve for most of the last year...
Do a `s/harness/software/` on that statement, and that is going to describe most companies shipping AI-written software.
> At this point, the difference mostly comes down to issues like the one the OP describes, so you're likely better off using e.g. pi (-agent) and writing your own custom skills and extensions (or any of the other harnesses the providers create; even copilot-cli has gotten decent nowadays)
They (AI-written software) are all going to be ahead in some way, until they aren't because they hit the practical limits of codebase size that can be reasonably understood by an LLM.
Codex, OpenCode and Pi are all good. I've been using Codex a lot and it's much more stable software than CC. Claude Code was once a leader, back in the hazy days of December/January, but now has a lot of competition.
What a joke. If "Anthropic is just a bunch of script kiddies" then everyone is, considering the dozens of billions poured into beating their models while they're still the go-to for coding, and have been for quite a while now. Just a nonsensical thing to say.
They got dethroned by some random Chinese company again this month. I don't think they are script kiddies, but I do think their moat is GPUs.
The US is doing everything it can to make it hard for other countries to compete. And yet, with everything stacked against them, and with way, way less money and way less fancy researchers, these other companies keep winning over and over again. Usually companies for which AI isn't even the main product.
Actually, Alibaba dethroned Sonnet this month too, with a model that's like 1/100th the size and can run on commodity hardware. So they do look kind of silly...
Definitely not script kiddies, but the way the researchers get managed makes them look goofy and sloppy and not interested in benefitting the consumer.
What is this reply, even? What's wrong with the vibe coding community? They have such ridiculous takes; it reminds me a lot of the extreme stances from the gaming community. The terminology also seems to come from there: "nerfing", etc.
Vibe coding, like Web3 before it (like Web 2.0 before it, like the dotcom boom before that - what preceded?) - harnesses the kind of focused attention with which gamers hook their brains into portals to virtual worlds - and directs all that bargain-basement wetware compute towards some obscured "real-world" goal instead. (See also: CADT development.)
Hyperscale these very inefficient but very dependable almost-not-efforts, and you beat the more efficient approaches. See also: evolutionary algorithms, autoresearch, price dumping; "attention is all you need", which though a legit piece of mathemagic always sounded to me like a rehash of that old adage, "all you need is love" (pejorative).
Really, "real world" is a consensus; we don't generally observe balamatoms or even balamolecules, we reason in terms of material objects' socially constructed balameanings and interrelations. Therefore, by redirecting sufficient attention to some thing labeled "unrealistic", we can remove that label; by this technique, a sufficiently large collective actor can quite literally, and quite directly, change the world. Without asking anyone, least of all me!
I think a lot of non-vibe-coding types also hold similar opinions -- in fact they might dislike Anthropic products even more, given that they (however few they might be) choose not to use them.
Why the insults/hostility? Why call them script-kiddies? Why the inflated egos?
How do you know what testing procedures they use? Do you honestly think they're running some kind of Ralph loop without any testing and just shipping whatever looks the coolest? Really?
> How do you know what testing procedures they use?
We don’t, but we can see the end result, so we know whatever they do isn’t adequate and it suggests they value shipping fast over quality or even listening to customer feedback.
> Do you honestly think they're running some kind of Ralph loop without any testing and just shipping whatever looks the coolest? Really?
No, but given how sharply the quality has been dropping over the past few months, and how suspiciously that coincided with them admitting that Claude Code is now 100% vibe coded, it certainly doesn’t feel too far off.
I’ve personally found the code that the AI writes, even this week (i.e., not some old models from months ago), to be shockingly shoddy. I’ve rewritten some AI code (created via spec-driven development and a workflow that includes planning and refactoring) by hand, and I’ve been very conscious of the number of micro design changes I as a human make where the AI just blows forward, shoehorning a solution into the design. My implementation has adjusted and shifted many times to ensure clear and performant logic, while the AI commits to an approach early and applies whatever brute force is necessary to make it work.
I’ve also asked it to write various tests for me or to make isolated changes and quite frankly the code was just not very good. Working, but convoluted. Even with guidance and iteration, it’s still not on a human level.
So it’s not hard to see that if you have an application as large and complex as Claude code and you let the AI do it all, that it’s going to be a mess.
I’m not against using AI for development, but you have to be realistic about its capabilities. I feel like this is where they “got high on their own supply” and are blinded to the AI’s shortcomings and failures.
They’ve said themselves that Claude Code is 100% vibe coded now. That certainly meets the criteria of “script kiddies” and “high on their own supply”. The negative connotations are there on purpose because of the bugs and issues that these products have, something which presumably they wouldn’t have if there was human oversight and acknowledgement that the AI isn’t infallible.
> They’ve said themselves that Claude Code is 100% vibe coded now. That certainly meets the criteria of “script kiddies”
That's not what script kiddies are at all.
> The negative connotations are there on purpose because of the bugs and issues that these products have, something which presumably they wouldn’t have if there was human oversight and acknowledgement that the AI isn’t infallible.
That's a big assumption, given that Anthropic is also currently growing by more than 3x per quarter. Maybe the problem is more complicated and we don't know everything, and they're also just simply suffering from growth pains?
Sure it is. The new age of script kiddies: they don’t know how to do it for themselves, but they can run a script (or tell the AI to) to do it for them.
> That's a big assumption
We can only see the results, which are more and more bugs, problems, regressions, etc. That’s not normal behavior. Yes all we can do is speculate, we don’t know the real reasons for the issues, but it’s clear there are issues and they appear to be getting worse.
I don’t understand how the hostility and insulting tone are considered reasonable now.
The comment is not just saying “their usage of their own AI is causing these issues”; it’s a lot of pure hostility, and I don’t see the value of these kinds of insults.
lol "hostility" - they sell a very high profile product and the issues seem to reflect bad engineering culture. therefore, I say their culture smells bad.
I didn't say anything like that! Like I said I just don't think that this opinion is somehow associated with "vibe coders"; if anything I'd expect the opposite.
I just want you to know that I read over this thread and you are obviously completely right. This sort of incurious, immature stance is something I've seen become the norm on HN over the last few years, particularly when it comes to AI.
This is amusing to me. Is there a list of extra-naughty filenames? How invasive is the scan? If I create a new file with a cursed word, will this get locked into virus-scanner purgatory, or is the deep locking only for external media? Will it get mad if I mount a CD full of virus names?
You've just flashed a future before my eyes where the IT security team forces 50k tokens of security-prevention context mandatorily into every prompt we issue. Harks back to the days when half your system memory and CPU were devoted to the continuously running virus checker.
Isn't this how people have always done it? When my boss and I are testing 3rd-party binaries, we open them in Notepad first. Browse through the bits, Ctrl+F for "virus" or "Russia", get a general feel for how safe it is. I know some people right-click and inspect the properties, but that's not thorough enough for this digital age.
yeah I've always thought of cyan as just "blue, but really bright", which does make sense - you're going from 0, 0, 100 (blue) to 0, 100, 100 (cyan) so it's twice as far from pure black. I also see pure cyan as being much more blue than green.
Just requiring explicit assignment before first use feels like the superior approach to automatic initialization, regardless of whether the automatic initialization is with 0 or with NaN.
The trouble with it is a bug I've seen often. People will get an error message about an "uninitialized variable". Then they go into "just get the compiler to shut up" mode, and pick "0" as the initializer. Then the program compiles and runs, and silently produces the wrong answer. Code reviews will simply pass over the "0" initializer, as it looks right.
With default NaN initialization, the programmer is more likely to stop and think about it, not just insert 0.
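A minimal C sketch of that failure mode (variable names invented for illustration; C doesn't default-initialize to NaN, so both "defaults" are written out by hand here):

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        /* "Shut the compiler up" fix: compiles, runs, silently wrong. */
        double weight = 0.0;
        /* D-style NaN default: the wrong answer is loudly wrong instead. */
        double weight_nan = NAN;

        printf("%g\n", 10.0 * weight);     /* prints 0   -- plausible, unnoticed */
        printf("%g\n", 10.0 * weight_nan); /* prints nan -- obviously broken    */
        return 0;
    }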
With the default initialization to NaN, do you ever run into situations where people are searching for common sources of NaN (NaN literals, division by zero) and can't find it? Or cases where only some branches, but not others, initialize the float?
To leave a variable uninitialized, use the construction:
    int x = void;
Note that nobody is going to write this by accident. And it's easy to grep for.
To find the source of a NaN, it helps to know that every operation that has a NaN as an operand produces a NaN as a result. So if you see a NaN in the output, you can work backwards to where it originated.
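A small illustration in plain C (invented variable names) of walking a NaN backwards to its origin:

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        double a = NAN;           /* the uninitialized "default"        */
        double b = a * 2.0 + 1.0; /* NaN propagates through arithmetic  */
        double c = sqrt(b);       /* ...and through most library calls  */
        printf("%g\n", c);        /* prints nan: trace c -> b -> a      */
        return 0;
    }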
> every operation that has a NaN as an operand produces a NaN as a result.
That's not true. The minimum/maximum functions (the fmin and fminimum_num variants, but not the fminimum ones) treat a NaN input as missing data, returning the non-NaN value if there is one. Similarly, hypot returns infinity when either argument is infinite, even if the other is NaN. And pow ignores a NaN exponent when the base is 1: pow(1, NaN) is 1, as is pow(NaN, 0).
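For reference, a quick C demonstration of those NaN-absorbing cases (behavior per C99 Annex F; exact output formatting varies by libc):

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        double x = NAN;
        printf("%g\n", fmin(x, 1.0));       /* 1   : NaN treated as missing data */
        printf("%g\n", hypot(x, INFINITY)); /* inf : infinity wins over NaN      */
        printf("%g\n", pow(1.0, x));        /* 1   : pow(1, y) == 1 for any y    */
        printf("%g\n", pow(x, 0.0));        /* 1   : pow(x, 0) == 1 for any x    */
        return 0;
    }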
What do you think of Kotlin's approach, where it has a TODO() function that can coerce to any type, but instead of populating the variable with a valid default value, it just throws?
How long did you think about this before making this declaration? How long did Walter Bright think about this before making his decision when designing his language? Not saying you're wrong, just something to think about perhaps.
C# requires explicit assignment. If an appeal to authority sways you (it shouldn't), you can substitute Anders Hejlsberg instead of this random OP. How long do you suppose Anders Hejlsberg thought about this?
But I contend it's more useful (and interesting) to think about the idea with your own mind instead of tallying up the perceived authority of its supporters and relying on trust. It was also somewhat rude to suggest that the OP had not given their idea much thought. This is a forum for discussion, isn't it?
I think you are misunderstanding what C# does here, and what the original poster was suggesting. Maybe we did a poor job of describing it; I assumed people knew C#. The key words in OP's post were "before first use." It sounds like you interpreted this to mean "every variable declaration must immediately assign a value" but that's not how it works. I'll explain C#'s semantics.
An assignment is required at some point before the first read, not in the declaration. It tracks assignments and usages, and it flags a compiler error if you read a variable before assigning to it for the first time. A variable that hasn't been assigned cannot be read.
It means you can do "int a;" and then later in the function do "a = 5;" and the compiler guarantees that you never read the variable before the assignment in any path through the function. You cannot do "int a;" and then read from it; that's a compile-time error.
It does not mean you have to assign something in the declaration. We never need "vacuous" initializations, and this solution works on all types. Indeed, we avoid vacuous initializations so that the compiler will catch use-before-assign bugs at compile time. The situation you described doesn't happen in C#. Our C# variables become readable on their first assignment, not their declaration; the declaration merely sets the scope. There's no need for a state where it's initialized to an invalid value before receiving the first intended assignment, because in C# the variable is completely inaccessible during that time.
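There's no exact C equivalent, but here's a sketch of the pattern being discussed (in C this is at best a warning, e.g. GCC's -Wmaybe-uninitialized; C# rejects it outright with error CS0165):

    int f(int flag) {
        int a;       /* declaration only; sets the scope, no value yet */
        if (flag)
            a = 5;   /* assigned on just one path                      */
        return a;    /* C#: compile error CS0165 "use of unassigned    */
                     /* local variable"; C: at best a compiler warning */
    }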
Ok, I understand that. Thank you for the explanation.
An analogous thing happens in D:
    int x;
    x = 5;
The compiler front end generates two assignments. But then, when it goes through the backend, the first assignment is deleted by what is known as the "dead assignment optimization".
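The same shape in a hypothetical C function; with optimization enabled, the first store is removed as a dead assignment:

    int f(void) {
        int x = 0;  /* dead store: never read before being overwritten */
        x = 5;      /* the only assignment that survives               */
        return x;   /* at -O1 and above this compiles to "return 5"    */
    }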
I wasn't really appealing to authority. I was just suspecting a familiar pattern that I always find a little distasteful. Person A describes a brilliant idea they're obviously proud of. Person B casually dismisses it and just claims without evidence that the obvious way of doing it is better. I find that pattern to be not only rude, but to suck some of the joy out of life. The fact that person A on this occasion is widely acknowledged as a brilliant practitioner I thought added weight to me pushing back. But I'd be inclined to feel the same way if person A was an enthusiastic wide eyed student (say) with a fresh perspective.
I can't imagine a situation where that sort of response isn't rude. It's polite to assume people have thought about their opinions, and to address the points rather than the person. If they didn't think it through, then you can counter their points.
In any event, I don't think Walter needed any help here. He is an HN veteran and always willing to discuss the points. Every programming language designer loves an opportunity to discuss their language with interested people! There's almost never a truly right answer in language design, just various tradeoffs.
We're both seeing rudeness. We both dislike rudeness. I think it's rude to rain on someone's parade. You think it's rude to assume someone hadn't thought through their comment (I basically agree) and to address the person rather than the points (I also basically agree).
You might note my original comment included softening elements ("perhaps", "not saying you're wrong"). In general if you look at all my comments you'll see I'm not a rude person, I'm pretty agreeable in general.
I was (trying to) make a meta point rather than a point about the specific technical issue. I agree (again!) that the last word has not been said on this issue or on any other issue where tradeoffs need to be weighed.
I read Walter's comment and thought "Wow, that's a surprising, clever and innovative idea, I'm impressed". And I just didn't enjoy someone bluntly saying, in effect. "No you're wrong, you shouldn't do it like that". It's as simple as that really. I know blunt exchanges of views are normal for programmers and engineers, I don't have to like it every time.
Finally, I know Walter Bright is no shrinking violet and he definitely doesn't need me to defend him!
Sure, but that only matters if default-initialising to NaN significantly reduces such bugs compared to the alternatives. IME it takes a very finely calibrated level of thoughtfulness for the argument in https://news.ycombinator.com/item?id=47928539 to work: the programmer has to be simultaneously thoughtless enough to initialise to 0 without thinking when the compiler requires initialisation, yet thoughtful enough to stop and think about it when the compiler initialises to NaN.
The same accessibility work that makes screen readers work well also makes automated UI tests simpler and less brittle (correct ARIA roles, accessible names, label relationships, etc.).
Do they have a substantial user base for this outside of Claude Code? The only two use cases for LLMs that seem to have significant traction are programming and erotic roleplay lol. If they stop catering to devs, who is their market?
I think it heavily depends on how you're using it. If you understand your codebase and you're using it like "build a function that does x in y file" then smaller/cheaper models are great. But if you're saying "hey build this relatively complex feature following the 30,000 foot view spec in this markdown doc" then Haiku doesn't work (unless your "complex feature" is just an api endpoint and some UI that consumes it).
I largely agree. But that goes back to my point (albeit with mixed metaphors): there are lots of people who are just hitting things with a jackhammer in lieu of understanding how to properly use a hammer.
I basically never just yolo large code changes, and use my taste and experience to guide the tools along. For this, Haiku is perfectly fine in nearly all circumstances.
The M1 transition was clean, and the hardware is amazing, don't get me wrong (I just bought a Neo and I'm very happy with it). But the transition did look even more amazing than it should have because of just how dogshit Intel Macs had gotten, especially around thermal throttling. Apple could have built much nicer systems on Intel already had they just made them slightly thicker and used sensible heatsink and fan designs for the hardware they were putting in them.
(We're seeing echoes of that again now where you can get 20-30% performance bumps in Neos and Airs just by sticking a thermal pad on the CPU - Apple is still allergic to cooling, they've just built amazingly efficient hardware that sidesteps the problem)