So in other words, it makes the system look better.
Look, disk time counts, whether it's Hadoop loading it into the memory of a given machine, or you reading it from disk and transferring it to a GPU piece by piece.
Your "hopefully it will be irrelevant" is, well, crazy.
I work for an employer with plenty of in-memory systems (very large ones in fact), and it certainly doesn't discount disk time. In fact, it matters a lot!
Disk time matters only when you read your data the first time. Subsequent queries won't have to touch the disk. The compressed data may sit in memory for days while queries never hit the disk. Now, if your compressed data doesn't fit into memory, then disk speeds would of course matter a lot; on this I agree with you.
You are making a lot of assumptions about working set sizes, etc.
In any case, even if you pay it only once, it is still a cost you are paying, and a cost Hadoop is paying, and it is completely wrong to simply subtract it out when comparing performance.