
A real issue here is the lack of training data (at least for LLMs). There's lots of high-quality (and plenty more poor-quality) open source software to train on. There's significantly less open source hardware, and what does exist is mostly front-end design. Good examples of complete testbenches (ones you'd close verification on and take to a production tape-out) are few and far between, and there's basically nothing for modern physical design and back-end considerations (i.e. how you take your design and actually manufacture a chip with it).

Commercial companies that may be interested in AI tools for EDA do have these things, of course, but are any of them going through the expensive process of fine-tuning LLMs with them?

Indeed, perhaps it's important to include a high-quality corpus in pre-training? I doubt anyone wants to train an LLM from scratch for EDA.

Perhaps Nvidia are doing experiments here? They've got the unique combination of access to a decent corpus, cheaper training costs, and in-house know-how.



I fine-tuned an LLM to do Verification IP wiring at an LLM hardware startup. We built the dataset in-house. It was quite effective, actually; with enough investment in expanding the dataset, this is a totally viable application.


I'm curious: did you have to tailor your dataset around instruction-following/reasoning capabilities as well? No conflict of interest on my part (I'm just interested in hobby programming for vintage computers); my understanding comes from Unsloth's fine-tuning guide. [1]

[1] https://docs.unsloth.ai/basics/datasets-guide


No problem - although I'm out of that particular role, it's appropriate to discuss since the company already shared these details in an OpenAI press release a few months back.

I fine-tuned reasoning models (o1-mini and o3-mini) that already had strong instruction-following and reasoning behavior. The dataset I prepared took this into account, but it was just simple prompt/response pairs. Defining the task tightly, ensuring the dataset was high quality, picking the right hyperparameters, and preparing a proper reward function (and modeling it against the API provided) were the keys to success.
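
To make "simple prompt/response pairs" concrete, here's a minimal sketch of what that kind of dataset prep could look like. The pair format follows OpenAI's chat fine-tuning JSONL convention; the wiring task, signal names, and reward heuristic are illustrative assumptions on my part, not the poster's actual data or grader.

```python
# Hypothetical sketch of prompt/response dataset prep for a VIP-wiring task.
# Format follows OpenAI's chat fine-tuning JSONL convention; the example
# spec, wiring, and reward heuristic are made up for illustration.
import json

def make_pair(spec: str, wiring: str) -> dict:
    """One prompt/response training example in chat-message form."""
    return {
        "messages": [
            {"role": "user", "content": f"Wire this VIP to the DUT:\n{spec}"},
            {"role": "assistant", "content": wiring},
        ]
    }

def reward(generated: str, golden: str) -> float:
    """Toy reward: fraction of golden port connections the model reproduced."""
    want = {c.strip() for c in golden.split(",") if c.strip()}
    got = {c.strip() for c in generated.split(",") if c.strip()}
    return len(want & got) / max(len(want), 1)

pairs = [
    make_pair(
        "axi_master_agent m0; DUT ports: clk, rst_n, awaddr, awvalid, awready",
        ".clk(clk), .rst_n(rst_n), .awaddr(m0.awaddr), "
        ".awvalid(m0.awvalid), .awready(m0.awready)",
    ),
]

with open("vip_wiring.jsonl", "w") as f:
    for p in pairs:
        f.write(json.dumps(p) + "\n")
```

The reward here is just set overlap on port connections; a real grader would presumably need to handle renamed nets, hierarchical paths, and compile-checking the result.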


That’s really cool. I’d love to see that process from up close.


> Indeed, perhaps it's important to include a high-quality corpus in pre-training? I doubt anyone wants to train an LLM from scratch for EDA.

That does sound reasonable to me. The main problem (at least for software) is that you can't train on source code alone: comments are written in human language, so you need a corpus of human language as well, so that the LLM learns that alongside the programming language(s). I'd assume the same holds for hardware.

Depending on what you're going for, you could take an existing pre-trained model and continue pretraining it on your EDA corpus. The catch is that you'd then have to reinvent (or lift from elsewhere) the entire fine-tuning dataset and pipeline to restore instruction-following behavior, which is significantly harder than just doing a fine-tune. A sketch of the continued-pretraining step is below.
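
For reference, the continued-pretraining step itself is mechanically straightforward. A minimal sketch with Hugging Face transformers, assuming a causal LM and a plain-text EDA corpus; the model name, file path, and hyperparameters are placeholders, not a tested recipe.

```python
# Minimal sketch of continued pretraining on a domain corpus.
# Model name, corpus path, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "meta-llama/Llama-3.1-8B"  # any pretrained causal LM
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token  # many base tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Raw RTL/testbench text, one document per line.
ds = load_dataset("text", data_files={"train": "eda_corpus.txt"})["train"]

def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=2048)

ds = ds.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="eda-cpt",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,
    ),
    train_dataset=ds,
    # mlm=False gives the standard next-token (causal LM) objective.
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```

The resulting checkpoint would know the domain but would have drifted from its chat behavior, which is exactly why the fine-tuning pipeline has to be rebuilt afterwards.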



