
I get the impression most llama.cpp users are interested in running models on GPU. AFAICT this optimization is CPU-only. Don't get me wrong: it's a huge one, and it opens the door to running llama.cpp on more and more edge devices.

