Hacker News
queuebert
on March 17, 2024
on:
Show HN: Flash Attention in ~100 lines of CUDA
As a person who finds CUDA extremely easy to write and integrate, what does Triton have to offer?
whimsicalism
on March 17, 2024
Block-level rather than thread-level programming, plus automatic tuning across kernel hyperparameters such as block size. Together these make it much easier to write fast kernels.
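To make the block-level vs. thread-level distinction concrete, here is a rough plain-Python model of the two styles. It is only a sketch: it runs on the CPU and uses neither CUDA nor Triton, and the function names (`add_thread_level`, `add_block_level`, `autotune`) are made up for illustration. The autotuner is a toy stand-in for the idea of picking hyperparameters by timing candidates, not Triton's actual mechanism.

```python
import time

# Plain-Python model of the two styles; no GPU, CUDA, or Triton involved.
# All names here are hypothetical and exist only for this illustration.


def add_thread_level(x, y, threads_per_block):
    """CUDA style: the kernel body is written for ONE thread and ONE element."""
    n = len(x)
    out = [0.0] * n
    num_blocks = (n + threads_per_block - 1) // threads_per_block
    for block_idx in range(num_blocks):
        for thread_idx in range(threads_per_block):
            # The author computes the global index by hand.
            i = block_idx * threads_per_block + thread_idx
            if i < n:  # per-thread bounds check
                out[i] = x[i] + y[i]
    return out


def add_block_level(x, y, block_size):
    """Triton style: the kernel body is written for ONE program and a whole block."""
    n = len(x)
    out = [0.0] * n
    num_programs = (n + block_size - 1) // block_size
    for pid in range(num_programs):
        # One program owns a contiguous block of offsets.
        offsets = range(pid * block_size, (pid + 1) * block_size)
        mask = [i < n for i in offsets]  # guards the ragged final block
        # Masked "loads" of whole blocks, with 0.0 for out-of-range lanes.
        xs = [x[i] if m else 0.0 for i, m in zip(offsets, mask)]
        ys = [y[i] if m else 0.0 for i, m in zip(offsets, mask)]
        zs = [a + b for a, b in zip(xs, ys)]  # one block-wide operation
        # Masked "store" of the block.
        for i, m, z in zip(offsets, mask, zs):
            if m:
                out[i] = z
    return out


def autotune(kernel, x, y, candidates):
    """Toy autotuner: time each candidate block size and return the fastest."""
    timings = {}
    for block_size in candidates:
        start = time.perf_counter()
        kernel(x, y, block_size)
        timings[block_size] = time.perf_counter() - start
    return min(timings, key=timings.get)


if __name__ == "__main__":
    x = [float(i) for i in range(1000)]
    y = [2.0 * i for i in range(1000)]
    assert add_thread_level(x, y, 128) == add_block_level(x, y, 256)
    print("fastest block size:", autotune(add_block_level, x, y, [64, 128, 256]))
```

In the block-level version the author only describes what happens to a whole tile of data. How that tile is split across threads is left to the compiler, which is roughly the convenience being claimed for Triton.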