This is a truly excellent post, but Anish (humbly) buries the lede: Anish implemented a really fast checker called Porcupine which is available here https://github.com/anishathalye/porcupine. Would love to hear more on how this helped you catch any bugs & how the diagnosis process goes from a failed check to bugfix.
Linearizability testing does help catch bugs, but in this implementation, it doesn't offer much guidance on how to go about finding bugs. That's something I'm currently thinking about, as this style of testing is something that I'm thinking of incorporating into MIT's 6.824 Distributed Systems class. We care about testing student's implementations for correctness, but giving good feedback that helps students find bugs is even more important than just telling students whether or not their implementations are buggy.