I never understand why people seem to think 'test flakiness' can be fixed by declaring "it's OK if the test only passes 1 in 10 times" -- if the outcome depends that heavily on the RNG, then something is bound to fail for users as well, so either the test is not representative of the game or something is rotten.
It depends entirely on the reason for that 1/10 times.
E.g. I had a test that hit a 3rd party sandbox that would go down occasionally. I could have mocked it to avoid this but I had way bigger fish to fry. The flakiness triggered a very specific error that we felt safe ignoring (other kinds of flakiness we would not ignore). The corresponding prod API never had issues.
Sometimes it's a SELECT statement without an ORDER BY. That's not strictly a bug, but quickly adding an ORDER BY solves the flakiness forever, so you might as well.
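A minimal sketch of that failure mode, using sqlite3 as a stand-in database (table and column names are made up for illustration). SQL makes no guarantee about row order without an ORDER BY, so a test asserting on the unordered result is at the mercy of the query planner; pinning the order makes the assertion deterministic:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(3, "carol"), (1, "alice"), (2, "bob")],
)

# Flaky: without ORDER BY, the engine may return rows in any order
# depending on the plan, indexes, or storage layout.
unordered = conn.execute("SELECT id, name FROM users").fetchall()

# Deterministic: pin the order the test asserts against.
ordered = conn.execute(
    "SELECT id, name FROM users ORDER BY id"
).fetchall()
assert ordered == [(1, "alice"), (2, "bob"), (3, "carol")]
```

The unordered query may happen to return insertion order on one engine and still break on another, which is exactly why the flakiness shows up only "sometimes".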
Sometimes it's a pretty terrifying race condition in the code that will absolutely crop up in prod.
Agreed - at a previous job, a lot of time was invested in setting up test retries and in splitting test runs so that you could retrigger only the subsets that failed instead of the whole test suite.
I guess the time to actually fix the tests is expensive...