If you see anything wrong or odd in it, feel free to share.
I'm still investigating what makes the run so slow, so if you can find something wrong in my code, that would help.
Replacing math.mod(n, i) with (n % i) makes the run roughly 9.4x faster.
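
For anyone who wants to reproduce the comparison, here is a minimal sketch of the two variants. The trial-division loop and the names count_divisors_call, count_divisors_op and time_it are made up for illustration; only n and i come from the code discussed above. It falls back to math.fmod where math.mod is not available. For positive operands both forms give the same result, but they can differ in sign for negative operands, so the swap is only a drop-in replacement when n and i are positive.

```lua
-- Hypothetical benchmark sketch: only `n` and `i` come from the original post.
-- math.mod is the older name; fall back to math.fmod if it is missing.
local fmod = math.mod or math.fmod

-- Variant 1: remainder via a library function call.
local function count_divisors_call(n)
  local count = 0
  for i = 1, n do
    if fmod(n, i) == 0 then count = count + 1 end
  end
  return count
end

-- Variant 2: remainder via the % operator.
local function count_divisors_op(n)
  local count = 0
  for i = 1, n do
    if n % i == 0 then count = count + 1 end
  end
  return count
end

-- Returns elapsed CPU seconds and the function's result.
local function time_it(f, n)
  local start = os.clock()
  local result = f(n)
  return os.clock() - start, result
end
```

Calling time_it(count_divisors_call, n) and time_it(count_divisors_op, n) with the same large n should show whether the difference holds on a given machine; the exact ratio will likely vary with the LuaJIT version and platform.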
EDIT: The LuaJIT version was 2.0.4, running on Mac OS X.