This is not where CUDA's moat is. The moat is on the R&D/training side. The inference side is partly about performance, but mostly about cost per token.
And given how much standardization there has been around LLaMA-style architectures, AMD/ROCm can target inference much more easily and still take a nice chunk of the market for non-SOTA models.
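A rough back-of-the-envelope sketch of why cost per token can dominate raw speed (all prices, throughputs, wattages, and lifetimes below are hypothetical placeholders, not measured figures):

```python
# Back-of-the-envelope cost-per-token estimate (hypothetical numbers).
# Amortize the GPU purchase price over its service life, add energy cost,
# and divide by token throughput.

def cost_per_million_tokens(gpu_price_usd: float,
                            tokens_per_second: float,
                            lifetime_years: float = 3.0,
                            power_watts: float = 350.0,
                            usd_per_kwh: float = 0.10) -> float:
    seconds = lifetime_years * 365 * 24 * 3600
    hardware = gpu_price_usd / (tokens_per_second * seconds)   # $/token, amortized
    energy = (power_watts / 1000) * usd_per_kwh / 3600 / tokens_per_second
    return (hardware + energy) * 1_000_000

# Hypothetical example: a cheaper card at 80% of the throughput can come
# out ahead on cost per token despite losing on raw speed.
print(cost_per_million_tokens(1600, 100))  # "4090-class" card: ~$0.27/M tokens
print(cost_per_million_tokens(1000, 80))   # cheaper card:      ~$0.25/M tokens
```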
Hypotheticals don't matter. The average user won't have the most expensive GPU, and in VRAM per dollar AMD is roughly half the price, so they lead in this area.
Not sure why you're downvoted, but as far as I've heard, AMD cards can't beat the 4090 yet.
Still, I think AMD will catch up to or overtake Nvidia in hardware soon, but software is the bigger problem. Hopefully the open-source strategy will pay off for them.
An RTX 4090 is about twice the price of, and roughly 50% faster than, AMD's most expensive consumer card, so I'm not sure anyone really expects AMD to surpass the 4090.
A 7900 XTX beating an RTX 4080 at inference is probably a more realistic goal, though I'm not sure how they compare right now.
The 4080 is $1k for 16 GB of VRAM, and the 7900 XTX is $1k for 24 GB. Unless you're constantly hammering it with requests, the extra speed you may get from CUDA on a 4080 is basically irrelevant when the extra VRAM lets you run much better models at a reasonable speed.
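To illustrate why the VRAM gap matters more than the speed gap: model weights take roughly (parameters × bits per weight ÷ 8) of memory, plus overhead for the KV cache and activations. A minimal sketch, using the usual rule-of-thumb figures (the 20% overhead is an assumption, not a measurement):

```python
# Rough VRAM footprint for LLM weights: parameters * bytes-per-weight,
# plus an assumed ~20% overhead for KV cache and activations.

def fits(params_billion: float, bits_per_weight: float, vram_gb: float,
         overhead: float = 1.2) -> bool:
    weights_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bits ~ 1 GB
    return weights_gb * overhead <= vram_gb

for params in (7, 13, 34, 70):
    # 4-bit quantization, a common choice for local inference
    print(f"{params}B 4-bit: 16 GB ok? {fits(params, 4, 16)}, "
          f"24 GB ok? {fits(params, 4, 24)}")
```

By this estimate a 4-bit 34B model fits in 24 GB but not in 16 GB, which is exactly the "much better models" gap between the two cards.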