So in other words, it makes the system look better.
Look, disk time counts, whether it's Hadoop loading it into the memory of a given machine, or you reading it from disk and transferring it to a GPU piece by piece.
Your "hopefully it will be irrelevant" is, well, crazy.
I work for an employer with plenty of in-memory systems (very large ones in fact), and it certainly doesn't discount disk time. In fact, it matters a lot!
Disk time matters only when you read your data the first time. Subsequent queries won't have to touch the disk. The compressed data may sit in memory for days while queries never hit the disk. Now, if your compressed data doesn't fit into memory, then disk speeds would of course matter a lot; on this I agree with you.
You are making a lot of assumptions about working set sizes, etc.
In any case, even if you pay it only once, it is still a cost you are paying, and a cost Hadoop is paying, and it is completely wrong to simply subtract it out when comparing performance.