As far as I can tell, it isn’t correct to say that this isn’t a matrix. B is just written down in transposed form. Whether that makes the math more or less clear is debatable, but it’s the same math, and it is confusing to call it something else.
It is a rank-two tensor with a special binary operation defined on it. These objects aren't matrices in the mathematical sense any more than convolution kernels are.
Matrices come with the matrix product defined over them.
This is one of four possible closed first-order tensor contractions for an order-two tensor, viz. A_ij B_ik, A_ij B_ki, A_ij B_jk and A_ij B_kj. Only the third is applicable to matrices; the others work only on general tensors unless you transpose first.
What we deal with in computer science are actually n-dimensional arrays, since we don't have the co- and contravariant indices that define tensors in physics.
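To make the four contractions concrete, here is a rough plain-Python sketch. The helper names (transpose, matmul) and the 2x2 example values are my own assumptions for illustration, not anything from the article. It writes each contraction as an explicit index sum and checks which ordinary matrix product (with transposes) it matches.

```python
# Illustrative sketch only: matrices are lists of rows, helper names are made up.

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Ordinary (Cayley) product: C[i][k] = sum_j a[i][j] * b[j][k]."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
n = 2  # both example matrices are n x n

# A_ij B_ik: sum over the first index of both, result indexed (j, k)
c1 = [[sum(A[i][j] * B[i][k] for i in range(n)) for k in range(n)] for j in range(n)]
# A_ij B_ki: first index of A against second index of B, result indexed (j, k)
c2 = [[sum(A[i][j] * B[k][i] for i in range(n)) for k in range(n)] for j in range(n)]
# A_ij B_jk: second index of A against first index of B, result indexed (i, k)
c3 = [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)] for i in range(n)]
# A_ij B_kj: sum over the second index of both, result indexed (i, k)
c4 = [[sum(A[i][j] * B[k][j] for j in range(n)) for k in range(n)] for i in range(n)]

print(c1 == matmul(transpose(A), B))             # columns into columns
print(c2 == matmul(transpose(A), transpose(B)))
print(c3 == matmul(A, B))                        # the usual rows into columns
print(c4 == matmul(A, transpose(B)))             # rows into rows
```

In this sketch only the third contraction is the matrix product as-is; the other three are the same product with one or both operands transposed.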
The thing described by the article can be summarized by: "Any time you see A x B, you replace it with A x B^T" and it would, in practice, be exactly the same. I'm not sure the author understood this, because they then go on to do a bunch of performance checks to see if there is any difference between the two. Which there isn't, because under the hood it is exactly the same operations. They just multiply columns into columns (or rows into rows) instead of rows into columns. But the implicit transpose would undo that.
You can note (correctly) that this doesn't line up with the precise but arbitrary traditional definition of a matrix. But that is just word games, because you can very simply, using only syntax and no calculations, transform one into the other.
https://en.wikipedia.org/wiki/Cracovian
The Cracovian product of two matrices, say A and B, is defined by
A ∧ B = B^T A,
where B^T and A are assumed compatible for the common (Cayley) type of matrix multiplication and B^T is the transpose of B.
Since (AB)^T = B^T A^T, the products (A ∧ B) ∧ C and A ∧ (B ∧ C) will generally be different; thus, Cracovian multiplication is non-associative.
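A quick way to see the non-associativity is to try it on small matrices. The following is a minimal plain-Python sketch of the definition quoted above; the function names (transpose, matmul, cracovian) and the example matrices are my own choices, not from the Wikipedia article.

```python
# Illustrative sketch only: matrices are lists of rows, helper names are made up.

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Ordinary (Cayley) product: C[i][k] = sum_j a[i][j] * b[j][k]."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def cracovian(a, b):
    """Cracovian product as quoted: A ∧ B = B^T A."""
    return matmul(transpose(b), a)

A = [[1, 2], [3, 4]]
B = [[1, 2], [0, 1]]
C = [[1, 1], [0, 1]]

left = cracovian(cracovian(A, B), C)   # (A ∧ B) ∧ C = C^T B^T A
right = cracovian(A, cracovian(B, C))  # A ∧ (B ∧ C) = (C^T B)^T A = B^T C A

print(left)   # [[1, 2], [6, 10]]
print(right)  # [[4, 6], [11, 16]]
```

The two groupings expand to C^T B^T A and B^T C A respectively, so they only agree when C^T B^T happens to equal B^T C, which is not the case for these example values.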
A good reference on how to use them and why they are useful is here (PDF):
https://archive.computerhistory.org/resources/access/text/20...