Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Nontemporal writes are substantially slower, e.g. with avx512 you can do 1 64 byte nontemporal write every 5 or so clock cycles. That puts you at >= 640 cycles for 8 KiB. https://uops.info/html-instr/VMOVNTPS_M512_ZMM.html


Well, the point of a non-temporal write kind of is that you don't care how fast it is. (Since if it was being read again anytime soon, you'd want it in the cache.)

But yes, it can be an over-optimization.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: