For models of this size, the code used to train them is going to be very custom ...

For models of this size, the code used to train them is going to be very custom to the architecture/cluster they are built on. It would be almost useless to anybody outside of Meta. The dataset would be more a lot more interesting, as it would at the very least show everybody how they got it to behave in certain ways.