
I am watching DTF St. Louis (which is not a terrible reality show about a third-tier city like the title implies, but actually a Jason Bateman-style dark comedy/whodunit), and Peyronie's disease features in the story. The show also has the first commercials I've ever seen for a Peyronie's treatment, and apparently it's an official ad partner of the show. I wonder if some enterprising show exec decided to go pitch the perfect sponsorship or if the company making a treatment commissioned a show...

A cure for Peyronie's that is non-invasive and works on both chronic and non-chronic cases would be worth countless billions.

DTF STL is a fantastic show as are all of the series done by Steven Conrad.

As a former PM, I will say that if you want to stop something from happening at your company, the best route is to come off very positive about it initially. This is critical because it gives you credibility. After my first few years of PMing, I developed a reflex that any time I heard a deeply stupid proposal, I would enthusiastically ask if I could take the lead on scoping it out.

I would do the initial research/planning/etc. mostly honestly and fairly. I'd find the positives, build a real roadmap and lead meetings where I'd work to get people onboard.

Then I'd find the fatal flaw. "Even though I'm very excited about this, as you know, dear leadership, I have to be realistic that in order to do this, we'd need many more resources than the initial plan because of these devastating unexpected things I have discovered! Drat!"

I would then propose options. Usually three, which are: Continue with the full scope but expand the resources (knowing full well that the additional resources required cannot be spared), drastically cut scope and proceed, or shelve it until some specific thing changes. You want to give the specific thing because that makes them feel like there's a good, concrete reason to wait and you're not just punting for vague, hand-wavy reasons.

Then the thing that we were waiting on happens, and I forget to mention it. Leadership's excited about something else by that point anyway, so we never revisit dumb project again.

Some specific thoughts for you:

1. Treat their arguments seriously. If they're handwaving your arguments away, don't respond by handwaving their arguments away, even if you think they're dumb. Even if they don't fully grasp what they're talking about, you can at least concede that agents and models will improve and that will help with some issues in the future.

2. Having conceded that, they're now more likely to listen to you when you tell them that while it's definitely important to think about a future where agents are better, you've got to deal with the codebase right now.

3. Put the problems in terms they'll understand. They see the agent that wrote this feature really quickly, which is good. You need to pull up the tickets that the senior developers on the team had to spend time on to fix the code that the agent wrote. Give the tradeoff - what new features were those developers not working on because they were spending time here?

4. This all works better if you can position yourself as the AI expert. I'd try to pitch a project of creating internal evals for the stuff that matters in your org to try with new models when they come out. If you've volunteered to take something like that on and can give them the honest take that GPT-5.5 is good at X but terrible at Y, they're probably going to listen to that much more than if they feel like you're reflexively against AI.


It's even better when you guide them into finding the fatal flaw for themselves.

Hahaha yes, this is absolutely true, but oftentimes it's so much more work.

As a (sometime) TPM, you are the kind of PM I've been looking for.

Hah, thanks but unfortunately I quit and started a business a couple of years ago, in no small part because I didn't want to spend my time maneuvering to kill stupid ideas.

Very well said. So many engineers balk at "coming off as positive" as a form of lying or as a pointless social ritual, but it's the only thing that gets you a seat at the table. Engineers who say "no" or "that's stupid" are never seen as leaders by management, even if they're right. The approach you laid out here is how you have _real_ impact as an engineering leader, because you keep getting a seat at the table to steer what actually happens.

I don't know that people are saying code is dead (or at least the ones who have even a vague understanding of AI's role) - more that humans are moving up a level of abstraction in their inputs. Rather than writing code, they can write specs in English and have AI write the code, much in the same way that humans moved from writing assembly to writing higher-level code.

But of course writing code directly will always maintain the benefit of specificity. If you want to write instructions to a computer that are completely unambiguous, code will always be more useful than English. There are probably a lot of cases where you could write an instruction unambiguously in English, but it'd end up being much longer because English is much less precise than any coding language.

I think we'll see the same in photo and video editing as AI gets better at that. If I need to make a change to a photo, I'll be able to ask a computer, and it'll be able to do it. But if I need the change to be pixel-perfect, it'll be much more efficient to just do it in Photoshop than to describe the change in English.

But much like with photo editing, there'll be a lot of cases where you just don't need a high enough level of specificity to use a coding language. I build tools for myself using AI, and as long as they do what I expect them to do, they're fine. Code's probably not the best, but that just doesn't matter for my case.

(There are of course also issues of code quality, tech debt, etc., but I think that as AI gets better and better over the next few years, it'll be able to write reliable, secure, production-grade code better than humans anyway.)


> But of course writing code directly will always maintain the benefit of specificity. If you want to write instructions to a computer that are completely unambiguous, code will always be more useful than English.

Unless the defect rate for humans is greater than LLMs at some point. A lot of claims are being made about hallucinations that seem to ignore that all software is extremely buggy. I can't use my phone without encountering a few bugs every day.


Yeah, I don't really accept the argument that AI makes mistakes and therefore cannot be trusted to write production code (in general, at least - obviously depends on the types of mistakes, which code, etc.).

The reality is we have built complex organizational structures around the fact that humans also make mistakes, and there's no real reason you can't use the same structures for AI. You have someone write the code, then someone does code review, then someone QAs it.

Even after it goes out to production, you have a customer support team and a process for them to file bug tickets. You have customer success managers to smooth over the relationships when things go wrong. In really bad cases, you've got the CEO getting on a plane to go take the important customer out for drinks.

I've worked at startups that made a conscious decision to choose speed of development over quality. Whether or not it was the right decision is arguable, but the reality is they did so knowing that meant customers would encounter bugs. A couple of those startups are now valued at multiple billions of dollars. Bugs just aren't the end of the world (again, in most cases - I worked on B2B SaaS, not medical devices or what have you).


> humans also make mistakes

This is broadly true, but not comparable when you get into any detail. The mistakes current frontier models make are more frequent, more confident, less predictable, and much less consistent than mistakes from any human I'd work with.

IME, all of the QA measures you mention are more difficult and less reliable than understanding things properly and writing correct code from the beginning. For critical production systems, mediocre code has significant negative value to me compared to a fresh start.

There are plenty of net-positive uses for AI. Throwaway prototyping, certain boilerplate migration tasks, or anything that you can easily add automated deterministic checks for that fully covers all of the behavior you care about. Most production systems are complicated enough that those QA techniques are insufficient to determine the code has the properties you need.


> The mistakes current frontier models make are more frequent, more confident, less predictable, and much less consistent than mistakes from any human I'd work with.

My experience is literally 180 degrees from this statement. And you don't normally get to choose the humans you work with; you may be involved in the interview process for some, but that doesn't tell you much. I have seen so much human-written code in my career that, in the right hands, I'll take (especially latest-frontier) LLM-written code over average human code any day of the week and twice on Sunday.


Humans also make mistakes, but unlike LLMs, they are capable of learning from their mistakes and will not repeat them once they have learned. That, not the capacity to make mistakes, is why you should not allow LLMs to do things.

Developers repeat the same mistakes all the time. Otherwise off-by-one errors wouldn't be a thing.

Most human bugs are caused by failures in reasoning, though, not by making something up to leap to the conclusion considered most probable, so I'm not sure the comparison makes sense.

The end result is the same either way, as is the resolution.

> most human bugs are caused by failures in reasoning though

Citation needed.


sorry, that is just taken from my experience, and perhaps I am considering reasoning to be a broader category than others might.

To be lenient, I will separate out bugs caused by insufficient knowledge as not being failures in reasoning. Do you have forms of bugs that you think are more common and are not arguably failures in reasoning that should be considered?

On edit: insufficient knowledge that I might not expect a competent developer to have is not a failure in reasoning, but a bug caused by insufficient knowledge that I would expect a competent developer in the problem space to have is a failure in reasoning, in my opinion.


I am currently using a Claude skill that I have been building out over the last few days that runs through my Amazon PPC campaigns and does a full audit: suggestions of bid adjustments, new search terms and products to advertise against, and adjustments to campaign structure. It goes through all of the analytics Amazon provides, which are surprisingly extensive, to find every search term where my product shows up, gets added to cart and purchased.

It's the kind of thing that would be hours of tedious work, then even more time to actually make all the changes to the account. Instead I just say "yeah do all of that" and it is done. Magic stuff. Thousands of lines of Python to hit the Amazon APIs that I've never even looked at.
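For anyone curious what that kind of audit logic boils down to, here's a minimal, self-contained sketch. The row fields, target ACOS, and bid rules are illustrative assumptions for this comment, not the actual Amazon Advertising API schema or the skill described above:

```python
# Hypothetical search-term audit: raise bids on efficient terms,
# lower them on inefficient ones, and negative-match dead spend.
# All field names and thresholds are made up for illustration.

def suggest_bid_changes(rows, target_acos=0.30):
    """Suggest bid adjustments from search-term report rows.

    Each row is a dict with: term, spend, sales, clicks, bid.
    ACOS (advertising cost of sales) = spend / sales.
    """
    suggestions = []
    for r in rows:
        if r["sales"] > 0:
            acos = r["spend"] / r["sales"]
            if acos < target_acos * 0.5:
                suggestions.append((r["term"], "raise", round(r["bid"] * 1.15, 2)))
            elif acos > target_acos:
                suggestions.append((r["term"], "lower", round(r["bid"] * 0.85, 2)))
        elif r["clicks"] >= 20:  # plenty of clicks, zero sales
            suggestions.append((r["term"], "negative", 0.0))
    return suggestions

rows = [
    {"term": "garlic press", "spend": 12.0, "sales": 120.0, "clicks": 40, "bid": 0.80},
    {"term": "kitchen tool", "spend": 30.0, "sales": 60.0, "clicks": 90, "bid": 1.00},
    {"term": "press garlic cheap", "spend": 15.0, "sales": 0.0, "clicks": 25, "bid": 0.60},
]
for term, action, bid in suggest_bid_changes(rows):
    print(term, action, bid)
```

The real skill would pull these rows from Amazon's reporting endpoints; the point is just that the per-term decision logic is simple once the data is in hand.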


And it doesn't freak you out that you're relying on thousands of lines of code that you've never looked at? How do you verify the end result?

I wouldn't trust thousands of lines of code from one of my co-workers without testing


> And it doesn't freak you out that you're relying on thousands of lines of code that you've never looked at?

I was a product manager for 15 years. I helped sell products to customers who paid thousands or millions of dollars for them. I never looked at the code. Customers never looked at the code. The overwhelming majority of people in the world are constantly relying on code they've never looked at. It's mostly fine.

> How do you verify the end result?

That's the better question, and the answer is a few things. First, when it makes changes to my ad accounts, I spot check them in the UI. Second, I look at ad reporting pretty often, since it's a core part of running my business. If there were suddenly some enormous spike in spend, it wouldn't take me long to catch it.
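The "catch a sudden spend spike" check can be sketched as a simple trailing-window anomaly test. The data shape (ISO date string to daily spend) and the three-sigma threshold are assumptions for illustration, not anything from the actual setup:

```python
# Flag days whose ad spend jumps far above the recent trailing average.
from statistics import mean, stdev

def spend_alerts(daily_spend, window=7, threshold=3.0):
    """Return dates whose spend exceeds the trailing `window`-day mean
    by more than `threshold` sample standard deviations."""
    alerts = []
    days = sorted(daily_spend)  # ISO date strings sort chronologically
    for i in range(window, len(days)):
        history = [daily_spend[d] for d in days[i - window:i]]
        mu, sigma = mean(history), stdev(history)
        today = daily_spend[days[i]]
        if sigma > 0 and (today - mu) / sigma > threshold:
            alerts.append(days[i])
    return alerts

# Ten ordinary days around $50/day, then one runaway day.
daily = {f"2025-01-{d:02d}": 50 + d % 3 for d in range(1, 11)}
daily["2025-01-11"] = 400
print(spend_alerts(daily))
```

A cron job running something like this against the ad account would surface a runaway agent well before the monthly bill does.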


It's thousands of lines of variation on my own hand-tooling, run through tests I designed, automated by the sort of onboarding docs I should have been writing years ago.


I've been doing agentic work for companies for the past year, and first of all, error rates have dropped to 1-2% with the leading Q3 and Q4 models, with 2026's Q1 models blowing those out of the water while also being cheaper in some ways.

But second of all, even when error rates were 20%, the time savings still meant a viable business. A much more viable business, actually; a scarily viable business, with many annoyed customers getting slop of some sort, and a human in the loop correcting things from the LLM before they went out to consumers.

Agentic LLM coders are better than your co-workers. They can also write tests. They can do stress testing, load testing, and end-to-end testing, though in my experience that's not even what course-corrects LLMs that well, so we shouldn't be trying to replicate processes made for humans with them. Like a human, the LLM is prone to just "correcting" a failing test on the assumption that it relies on something deprecated, rather than recognizing that a product change broke the test and revealed a regression.

In my experience, type errors, compiler errors, deployment logs and database entries have made the LLM correct its approach more than tests have. DevOps and data science, more than QA.


Why wouldn't you test? That sounds like a bad thing.

Me? I use AI to write tests, just as I use it to write everything else. I pay a lot of attention to what's being done, including code quality, but I am no more insecure about trusting those thousands of tested lines than I am about trusting the bytecode generated from the "strings of code."

We have just moved up another level of abstraction, as we have done many times before. It will take time to perfect but it's already amazing.


So people don't look at the code, or the tests.

So they don't know if it has the right behavior to begin with, or even if the tests are testing the right behavior.

This is what people are talking about. This is why nobody responsible wants to uberscale a serious app this way. It's ridiculous to see so much hype in this thread, people claiming they've built entire businesses without looking at any code. Keep your business away from me, then.


Do you trust the assembly your compiler puts out? The machine code your assembler puts out? The virtual machine it runs on? Thousands of lines of code you've never looked at...


None of that is generated by an LLM prone to hallucination, and it is perfectly deterministic unless there's a hardware problem.

And yes, I have occasionally run into compiler bugs in my career. That's one reason we test.


> None of that is generated by an LLM

How did you verify that?

> prone to hallucination

You know humans can hallucinate?

> is perfectly deterministic

We agree then that you can verify, test, and trust the deterministic code an LLM produces without ever looking at it.

> That's one reason we test

That's one way we can trust and verify code produced by an LLM. You can't stop doing all the other things that aren't coding.

I get there's a difference. Shitty code can be produced by LLMs or humans. LLMs really can pump out the shitty code. I just think the argument that you can't trust code you haven't viewed is not a good argument. I very much trust a lot of code I've never seen, and yes, I've been bitten by it too.

Not trying to be an ass; more trying to figure out how I'm going to deal for the next decade before retirement age. It's going to be a lot of testing and verification, I guess.


> How did you verify that?

The compiler works without an internet connection and requires too few resources to be secretly running a local model. (Also, you can inspect the source code.)

> You know humans can hallucinate?

We are talking about compilers…

> We agree then that you can verify, test, and trust the deterministic code an LLM produces without ever looking at it.

Unlike a compiler, an LLM does not produce code in a deterministic way, so it’s not guaranteed to do what the input tells it to.


It is for me, because the LLM supercharges my ability to evaluate, too.


Compiler theory and implementation are based on mathematical and logical principles, and are hence much more provable and trustworthy than an LLM that's stitching together pieces of text based on "training."


"Trust"? God no. That's why I have a debugger


Also you really do have to know how the underlying assembly integer operations work or you can get yourself into a world of hurt. Do they not still teach that in CS classes?


It's also worth noting that the "our" in that sentence is just SWEs, who are a pretty small group in the grand scheme of things. I recognize that's a lot of HN, but it still bears considering in terms of the broader impact outside of that group.

I'm a small business owner, and AI has drastically increased my agency. I can do so much more - I've built so many internal tools and automated so many processes that allow me to spend my time on things I care about (both within the business but also spending time with my kids).

It is, fortunately and unfortunately, the nature of a lot of technology to disempower some people while making lives better for others. The internet disempowered librarians.


> It's also worth noting that the "our" in that sentence is just SWEs

It isn't; it's just a matter of seeing ahead of the curve. Delegating stuff to AI and agents by necessity leads to atrophy of the skills being delegated. Using AI to write code leads to reduced capability to write code (among people). Using AI for decision-making reduces capability for making decisions. Using AI for math reduces capability for doing math. Using AI to formulate opinions reduces capability to formulate opinions. Using AI to write summaries reduces capability to summarize. And so on. And, by nature, less capability means less agency.

"Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them."

Not to mention utilizing AI for control, spying, surveillance and coercion. Do I need to explain how control is opposed to agency?


I'll grant that it does extend beyond SWEs, but whether AI atrophies skills is entirely up to the user.

I used to use a bookkeeper, but I got Claude a QuickBooks API key and have had it doing my books since then. I give it the same inputs and it generates all the various journal entries, etc. that I need. The difference between using it and my bookkeeper is I can ask it all kinds of questions about why it's doing things and how bookkeeping conventions work. It's much better at explaining than my bookkeeper and also doesn't charge me by the hour to answer. I've learned more about bookkeeping in the past month than in my entire life prior - very much the opposite of skill atrophy.

Claude does a bunch of low-skill tasks in my business, like copying numbers from reports in different systems into a centralized Google Sheet. My muscle memory at running reports and pulling out the info I want has certainly atrophied, but who cares? It was a skill I used because I needed the outcome, not because the skill was useful.

You say that using AI reduces all these skills as though that's an unavoidable outcome over which people have no control, but it's not. You can mindlessly hand tasks off to AI, or you can engage with it as an expert and learn something. In many cases the former is fine. Before AI ever existed, you saw the same thing as people progressed in their careers. The investment banking analyst gets promoted a few times and suddenly her skill at making slide decks has atrophied, because she's delegating that to analysts. That's a desirable outcome, not a tragedy.

Less capability doesn't necessarily mean less agency. If you choose to delegate a task you don't want to do so you can focus on other things, then you are becoming less capable at that skill precisely because you are exercising agency.

Now in fairness I get that I am very lucky in that I have full control of when and how I use AI, while others are going to be forced to use it in order to keep up with peers. But that's the way technology has always been - people who decided they didn't want to move from a typewriter to a word processor couldn't keep up and got left behind. The world changes, and we're forced to adapt to it. You can't go back, but within the current technological paradigm there remains plenty of agency to be had.


> but whether AI atrophies skills is entirely up to the user

The thing with society is that we cannot simply rely on the self-discipline and self-control of individuals. For the same reason, we have a universal and legally enforced education system. We would still live in a mostly illiterate society if people were not forced to learn or to send their children to school.

Analogies to past inventions are limited because AI doesn't automate physical labor, hard or light; it automates, or at least its overlords claim it automates, a lot of cognitive and creative labor. Thinking itself, at least in some of its aspects.

From a sociological and political perspective, there is a huge difference between the majority of the population losing the capability to forge swords or sew dresses by hand and losing the capability to formulate coherent opinions and communicate them.


I use Claude code in a number of different parts of my business - coding internal applications, acting as a direct interface to SaaS via APIs and just general internal use.

I find there is a virtuous cycle here where the more I use it, the more helpful it is. I fired my bookkeeper and have been using Claude with a QBO API key instead, and because it already had that context (along with other related business context), when I gave it the tax docs I gave to my CPA for 2024's taxes plus my return, and asked it to find mistakes, it determined that he did not depreciate goodwill from an acquisition. CPA confirmed this was his error and is amending my return.

Then I thought it'd be fun to see how it would do at constructing my 2024 return just from the same source docs my CPA had. The first time I did it, it worked for an hour, then said it had generated the return, checked it against the 2024 numbers and found they were the same. I had removed the 2024 return before having it do this to avoid poisoning the context with the answers, but it turned out it had a worksheet .md file that it was using on prior questions that I had not erased (and it then admitted that it had started from the correct numbers).

In order to make sure I wouldn't have that issue again, I tried the 2024 return again, completely devoid of any historical context in a folder totally outside of my usual Claude Code folder tree. It actually got my return almost entirely correct, but it missed the very same deduction that it had caught my CPA missing earlier.

So for me, the buildup of context over time is fantastic and really leads to better results.
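As an illustration of the kind of cross-check involved in the goodwill story above: acquired goodwill is generally amortized straight-line over 15 years for US tax purposes (Section 197). A toy version of "did the return claim the expected deduction" might look like this; the figures are hypothetical and this is not tax advice:

```python
# Toy cross-check: expected first-year amortization of acquired goodwill,
# assuming US Section 197 treatment (straight line over 15 years =
# 180 months, prorated by months in service). Figures are made up.

def goodwill_amortization(basis, months_in_service):
    """Expected amortization deduction for `months_in_service` months."""
    return round(basis / 180 * months_in_service, 2)

# Hypothetical acquisition: $90,000 of goodwill, closed mid-year
# (6 months in service), and a filed return claiming $0.
expected = goodwill_amortization(90_000, 6)
claimed = 0.0
if abs(expected - claimed) > 1:
    print(f"Possible missed deduction: expected ~${expected:,.2f}, "
          f"return shows ${claimed:,.2f}")
```

Nothing sophisticated, but it's exactly the sort of mechanical comparison an LLM with the source docs in context can run across every line item.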


> it can price compare and then ask the user for confirmation

Sure, but that's explicitly not what the Citrini article said. It said: "The part that should have unsettled investors more than it did was that these agents didn’t wait to be asked. They ran in the background according to the user’s preferences. Commerce stopped being a series of discrete human decisions and became a continuous optimization process, running 24/7 on behalf of every connected consumer."


There are people already doing that today. Why do you think it will not increase in usage?

That's sort of beside the point. Both of you claim an extreme; the truth is in between.


I do think the models themselves will get commoditized, but I've come around to the opinion that there's still plenty of moat to be had.

On the user side, memory and context, especially as continual learning is developed, is pretty valuable. I use Claude Code to help run a lot of parts of my business, and it has so much context about what I do and the different products I sell that it would be annoying to switch at this point. I just used it to help me close my books for the year, and the fact that it was looking at my QuickBooks transactions with an understanding of my business definitely saved me a lot of time explaining.

On the enterprise side, I think businesses are going to be hesitant to swap models in and out, especially when they're used for core product functionality. It's annoying to change deterministic software, and switching probabilistic models seems much more fraught.


If you embezzled money at your last company, I shouldn't be able to decline to hire you on my finance team on that basis?


In many sane countries, companies can ask you to provide a legal certificate stating that you did not commit any crime in category X. The certificate will then either say that you committed no crimes in that category, or that you committed one or more of them. The exact crimes aren't mentioned.

Coincidentally these same countries tend to have a much much lower recidivism rate than other countries.


This doesn't seem better?

I'm an employer and I want to make sure you haven't committed any serious crimes, so I ask for a certificate saying you haven't committed violent crimes. I get a certificate saying you have. It was a fistfight from a couple of decades ago when you were 20, but I don't know if it's that or if you tortured someone to death. Gotta take a pass on hiring you, sorry.

Seems like the people this benefits relative to a system in which a company can find out the specific charges you were convicted of would be the people who have committed the most heinous crimes in a given category.


At least where I live, a fistfight from decades ago wouldn’t be on the certificate. In your example you want to know about serious crimes, but ask for violent crimes, why are you surprised that the answer you get won’t be useful to make a decision?

As in many things judicial, it only works if the rest of the system is designed to make it work.


No, because even if they're not sold as new (which as others have commented is often not the case), they're still competing with you for sales. Someone who would have paid full price for a new one instead gets a version with a slight issue at 25% off. That's fine if you're the one selling it at a discount, but here you've lost money on the production and are now losing even more money because you've lost a sale of a full price unit.


I think the spirit of that regulation is that you, as the producer, see it as an incentive to better manage production so there is no need to discard or burn 10% of everything.

