I reckon you need tacit knowledge. Experience. Luckily on the order
of 100 hours, not 10,000.
Build a GPT using Python and PyTorch. For a good course: Andrej Karpathy is your keyword. At $1000 his course would be great value. But actually it is free, which is even better ;-)
It won't take you to flash attention, but it will ramp you up to the point where you could probably read papers about it. I almost got that far, then life lifed me. But I was able to implement changes to the GPT architecture and have some “hey mum, I am doing SOTA (2021) machine learning” moments.