I've used it! My gripe is that I'm not fond of their async implementations; they blow up the code when I want to do something simple. I understand the tradeoff is there in order to utilize external hardware for bigger tensor applications, but sometimes I just want a thin API for n-dimensional operations... like... well, you know what I'm going to say. :)