So the obvious thing to do... Send a patch to change the "copy_user_generic" kernel method to use a different memory copying implementation when the CPU is detected to be a bad one and the memory alignment is one that triggers the slowness bug...
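For anyone wondering what that kind of patch boils down to, here is a rough userspace sketch, not the actual kernel change. Every name in it (`cpu_has_slow_rep_movs`, `near_page_start`, `choose_copy_path`, `copy_fallback`, `copy_dispatch`) is made up for illustration, and the 64-byte "close to a page boundary" threshold is a placeholder assumption, since the exact trigger condition isn't spelled out here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE_BYTES 4096u
#define NEAR_PAGE_SLACK 64u /* hypothetical threshold, not a documented value */

/* Hypothetical flag: would be set once at boot after CPU detection. */
static bool cpu_has_slow_rep_movs = false;

enum copy_path { COPY_FAST, COPY_FALLBACK };

/* Placeholder predicate: true if the address sits just past a page boundary. */
static bool near_page_start(const void *p)
{
    return ((uintptr_t)p & (PAGE_SIZE_BYTES - 1)) < NEAR_PAGE_SLACK;
}

/* Take the fallback only on affected CPUs and only for suspect alignments. */
static enum copy_path choose_copy_path(const void *dst, const void *src)
{
    if (cpu_has_slow_rep_movs && (near_page_start(dst) || near_page_start(src)))
        return COPY_FALLBACK;
    return COPY_FAST;
}

/* Stand-in for the alternative implementation: a plain byte loop. */
static void copy_fallback(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    while (n--)
        *d++ = *s++;
}

/* Copy n bytes and report which path was taken. */
static enum copy_path copy_dispatch(void *dst, const void *src, size_t n)
{
    enum copy_path path = choose_copy_path(dst, src);

    if (path == COPY_FALLBACK)
        copy_fallback(dst, src, n);
    else
        memcpy(dst, src, n); /* stand-in for the normal fast path */
    return path;
}
```

The point of splitting out `choose_copy_path` is that the policy (which CPUs, which alignments) stays separate from the copy routines, which is exactly the part people disagree about.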
Not obvious. If it can be corrected with microcode, just have people use updated microcode rather than litter the kernel with workarounds for what are effectively patchable software problems.
The accepted fix would not be trivial to anyone not already experienced with the kernel. More importantly, it clearly isn't obvious what the right way to enable the workaround is. The best way is probably to measure at boot time; otherwise, how do you know which models and steppings are affected?
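A minimal sketch of the "measure at boot" idea, in userspace C rather than kernel code. The helpers `time_copy` and `workaround_needed` and the ratio threshold are hypothetical; a real probe would time the fast path at a suspect alignment against a known-good one and would need to deal with noise, frequency scaling and so on.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

typedef void (*copy_fn)(void *dst, const void *src, size_t n);

/* Stand-in for the routine under test. */
static void copy_libc(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Time `iters` copies of n bytes; returns elapsed CPU time in seconds. */
static double time_copy(copy_fn fn, void *dst, const void *src,
                        size_t n, int iters)
{
    clock_t start = clock();

    for (int i = 0; i < iters; i++)
        fn(dst, src, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/*
 * Decide whether to enable the workaround: true if the suspect case is
 * more than max_ratio times slower than the baseline case.
 */
static bool workaround_needed(double suspect_s, double baseline_s,
                              double max_ratio)
{
    return suspect_s > baseline_s * max_ratio;
}
```

You would call `time_copy` twice with the same routine, once on buffers at the suspect alignment and once on a known-good alignment, then feed both timings to `workaround_needed`. That way the decision depends on observed behavior instead of a hard-coded list of models and steppings.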
I don't think AMD does microcode updates for performance issues, do they? I thought it was strictly for correctness or security issues.
If the vendor won't patch it, then a workaround is the next best thing. There shouldn't be many - that's why all copying code is in just a handful of functions.
A significant, otherwise undocumented performance degradation from normal use of a CPU feature (FSRM) is a correctness problem, especially considering that the workaround is to avoid using the feature in many cases. People pay for this CPU feature; now they need kernel tooling to warn them when they fall back to some slower workaround because of an alignment issue way up the stack.
If AMD has a performance issue and doesn't fix it, AMD should pay the negative publicity costs rather than kernel and library authors adding exceptions. IMHO.
It’s not a trivial fix. Besides the fix likely belonging in microcode (where AMD figures out why aliasing is broken for addresses that are close to page-aligned), even a software mitigation would be complex, because the kernel cannot actually use the vector instructions that are typically used for the fallback path when ERMS is not available.
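To make the "no vector instructions" constraint concrete, here is a sketch of what a fallback limited to general-purpose registers could look like. `copy_gpr_words` is a hypothetical name and this is plain userspace C, not the kernel's actual routine; a userspace compiler is also free to auto-vectorize this loop unless told otherwise.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n bytes, 8 at a time through a 64-bit integer, then the tail bytewise. */
static void copy_gpr_words(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    while (n >= sizeof(uint64_t)) {
        uint64_t w;

        /* Fixed-size memcpy keeps unaligned loads and stores well-defined. */
        memcpy(&w, s, sizeof w);
        memcpy(d, &w, sizeof w);
        d += sizeof w;
        s += sizeof w;
        n -= sizeof w;
    }
    while (n--)
        *d++ = *s++;
}
```

It moves a quarter or less of what a wide vector load/store does per iteration, which is part of why a software mitigation isn't a free swap.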