Hacker News

This is awesome, scary and very interesting. But, for me, it comes with a personal concern:

For some time I've been giving serious thought to an automated web service generator. Given a data model and information about the data (relationships, intents, groupings, etc.), it would output a fully deployable service: from unit tests through container definitions, and everything I can think of in between (docs, OpenAPI spec, log forwarder, etc.)
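As a rough sketch of what that generator's input might look like (the names and structure here are my own assumptions, not an existing tool), the deterministic part is the key point: the same spec always yields the same artifacts, such as an OpenAPI path skeleton:

```python
from dataclasses import dataclass, field

@dataclass
class Field:
    name: str
    type: str                     # e.g. "uuid", "str", "datetime"
    required: bool = True

@dataclass
class Entity:
    name: str
    fields: list[Field]
    relations: dict[str, str] = field(default_factory=dict)  # field name -> target entity

@dataclass
class ServiceSpec:
    entities: list[Entity]
    intents: list[str]            # e.g. "crud", "search", "audit-log"

def openapi_paths(spec: ServiceSpec) -> dict:
    """Derive an OpenAPI path skeleton deterministically from the spec."""
    paths = {}
    for e in spec.entities:
        base = f"/{e.name.lower()}s"
        paths[base] = {"get": {}, "post": {}}
        paths[base + "/{id}"] = {"get": {}, "put": {}, "delete": {}}
    return paths

spec = ServiceSpec(
    entities=[Entity("User", [Field("id", "uuid"), Field("email", "str")])],
    intents=["crud"],
)
print(sorted(openapi_paths(spec)))  # ['/users', '/users/{id}']
```

Because the mapping from spec to output is pure code, any bug in the generated paths is traceable to `openapi_paths` itself, which is the "provable" property discussed below.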

So far, while my investment hasn't been very large, I have to ask myself: "Is it worth it?"

Watching this AI code-generation stuff closely, I've been telling myself the story that AI-generated code is not "provable". A deterministic system (like the one I've been imagining) would be "provable": bugs or other unintended consequences would be directly traceable to the code generator itself. With AI code generation, there's currently no real way to know for sure.

Some leading questions (for me) come down to:

1. Are the sources used by the AI's learning phase trustworthy? (e.g. When will models be sophisticated enough to be trained to avoid some potentially problematic solutions?)

2. How would an AI-generated solution be maintained over time? (e.g. When can AI prompt + context be saved and re-used later?)

3. How is my (potentially proprietary) solution protected? (e.g. When can my company host a viable trained model in a proprietary environment?)

I want to say that my idea is worth it because the answers to these questions are (currently) not great (IMO) for the AI-generated world.

But, the world is not static. At some point, AI code generators will be 10x or 100x more powerful. I'm confident that, at some point, these code generators will easily surpass my 20+ years of experience. And, company-hosted, trained AI models will most likely happen. And context storage and re-use will (by demand) find a solution. And trust will eventually be accomplished by "proof is in the pudding" logic.

Basically, barring laws governing AI, my project doesn't stand a snowball's chance in hell. I knew this would happen at some point, but I was thinking more like a 5-10 year timeframe. Now I realize it could be 5-10 months.



Not OP but I've been playing with similar technology as a hobby.

>1. Are the sources used by the AI's learning phase trustworthy? (e.g. When will models be sophisticated enough to be trained to avoid some potentially problematic solutions?)

Probably not, but for most domains reviewing the code should be faster than writing it.

>2. How would an AI-generated solution be maintained over time?

I would imagine you don't save the original prompts. Rather, when you want to make changes you just give the AI the current project and a list of changes to make. Copilot can do this to some extent already. You'd have to do some creative prompting to get around context size limitations, maybe giving it a skeleton of the entire project and then giving actual code only on demand.
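That skeleton-plus-on-demand idea can be sketched roughly like this (everything here is hypothetical; there is no standard API for this, and real prompting would need much more care):

```python
def skeleton(files: dict[str, str]) -> str:
    """Reduce each file to its def/class signature lines to save context tokens."""
    lines = []
    for path, src in sorted(files.items()):
        lines.append(f"# {path}")
        for line in src.splitlines():
            stripped = line.strip()
            if stripped.startswith(("def ", "class ")):
                lines.append("  " + stripped)
    return "\n".join(lines)

def build_prompt(files: dict[str, str], change_request: str) -> str:
    """Assemble a maintenance prompt: project outline plus the requested change."""
    return (
        "Project skeleton:\n" + skeleton(files)
        + "\n\nRequested change:\n" + change_request
        + "\nAsk for any file's full source before editing it."
    )

files = {"app.py": "class App:\n    def run(self):\n        pass\n"}
print(build_prompt(files, "Add a /health endpoint."))
```

The point is that the prompt carries only an outline of the whole project, and full file bodies are supplied in follow-up turns when the model asks for them.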

> When can my company host a viable trained model in a proprietary environment?

Hopefully soon. A fine-tuned LLaMA would not be far off GPT-3.5, but it's nowhere close to GPT-4. And even then there are licensing concerns.


Ok, a few derivative "fears" around this...

1> Relying on code reviews has concerns, IMO. For example, how many engineers actually review the code in their dependencies? (But, I guess it wouldn't take that much to develop an adversarial "code review" AI?)

2> Yes, agreed, that would work. Provided the original solution had viable tests, the second (or subsequent) rounds would have something to keep the changes grounded. In fact, perhaps the existing tests are enough, making the next AI version of the solution truly "agile"?

3> So, at my age (yes, getting older) I'm led to a single, tongue-in-cheek / greedy question: how does one invest in these AI training data sets?


> Given a data model and information about the data (relationships, intents, groupings, etc.) output a fully deployable service. From unit tests through container definitions, and everything I can think of in-between (docs, OpenAPI spec, log forwarder, etc.)

AWS roughly has one of these in Amplify. The data-mapping parts are pretty great, though a lot of the rest of it sucks. The question I'd ask is whether those parts suck by the nature of the setup, or whether it's just Amplify that has weird assumptions.


Thanks for the questions. I'll reply just a bit later. I'm on a train and reading/writing makes my stomach sick.



