This is a good list that includes a lot of things most people miss. I would also suggest:
1. Tight targeting of your users in an A/B test. This can mean proper exposure logging, or restricting the analysis to down-funnel users if you’re actually running a down-funnel experiment (rough sketch after this list). If your new iOS and Android feature is going to launch separately, then split them into separate experiments.
2. Making sure your experiment runs in 7-day increments. Averaging out weekly seasonality not only helps reduce variance but also ensures your results accurately predict the effect of a full rollout.
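On the exposure-logging point, here is a rough sketch of what tight targeting buys you at analysis time. The table and column names below are made up for illustration; the idea is simply that only users with a logged exposure enter the comparison, because unexposed users dilute the measured effect.

```python
import pandas as pd

# Toy stand-ins for real tables (names and columns are assumptions, not a real schema).
# exposures: one row per user who actually hit the new feature.
exposures = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "variant": ["control", "test", "control", "test"],
})
# metrics: outcomes for every user in the app, exposed or not (users 5-8 never saw the feature).
metrics = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6, 7, 8],
    "converted": [0, 1, 1, 1, 0, 0, 1, 0],
})

# Restrict the analysis population to exposed users via an inner join.
exposed = exposures.drop_duplicates("user_id")
analysis = metrics.merge(exposed, on="user_id", how="inner")
print(analysis.groupby("variant")["converted"].mean())
```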
Everything mentioned in this article, including stratified sampling and CUPED, is available out of the box on Statsig. Disclaimer: I’m the founder, and this response was shared by our DS Lead.
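For anyone who hasn’t seen CUPED before, the core adjustment is small enough to sketch in a few lines of numpy. This is the textbook version on simulated data (not any particular vendor’s implementation): use a pre-experiment covariate to strip predictable variance out of the metric before comparing arms, which leaves the lift estimate unbiased but much tighter.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Simulated users: pre-experiment metric (covariate) and in-experiment metric.
pre = rng.normal(10, 3, n)                          # pre-period engagement
assign = rng.integers(0, 2, n)                      # 0 = control, 1 = treatment
y = 0.8 * pre + rng.normal(0, 2, n) + 0.1 * assign  # true lift = 0.1

# CUPED: y_cuped = y - theta * (pre - mean(pre)), with theta = cov(y, pre) / var(pre).
theta = np.cov(y, pre)[0, 1] / np.var(pre, ddof=1)
y_cuped = y - theta * (pre - pre.mean())

for name, metric in (("raw", y), ("CUPED", y_cuped)):
    lift = metric[assign == 1].mean() - metric[assign == 0].mean()
    print(f"{name:6s} lift = {lift:.3f}, metric variance = {metric.var():.2f}")
```

Same lift estimate in expectation, much lower variance, so you reach significance sooner.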
> 2. Making sure your experiment runs in 7-day increments. Averaging out weekly seasonality not only helps reduce variance but also ensures your results accurately predict the effect of a full rollout.
There are of course many seasonalities (day/night, weekly, monthly, yearly), so it can be difficult to decide how broad a window you want to collect data over. But I remember interviewing at a very large online retailer, and they ran their A/B tests in an hour because they "would collect enough data points to be statistically significant", and that never sat right with me.
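To make that concrete with made-up numbers: suppose the feature only helps during a few evening hours. An hour-long test run in the evening will happily reach p < 0.05, but its estimate is roughly 5x the true all-day effect, so "enough data points to be statistically significant" answers the wrong question (precision, not representativeness).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def simulate(hours, users_per_hour=20_000):
    """Toy conversion A/B test run over the given hours of the day."""
    control, treatment = [], []
    for h in hours:
        base = 0.05
        lift = 0.01 if 18 <= h <= 22 else 0.0   # made-up: effect exists only in the evening
        control.append(rng.binomial(1, base, users_per_hour))
        treatment.append(rng.binomial(1, base + lift, users_per_hour))
    c, t = np.concatenate(control), np.concatenate(treatment)
    _, p = stats.ttest_ind(t, c)
    return t.mean() - c.mean(), p

print("one evening hour:", simulate([20]))       # ~+1.0pp lift, tiny p-value
print("full day:        ", simulate(range(24)))  # true average effect is only ~+0.2pp
```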