I think the idea is that each short function calls other short functions at a lower conceptual level, so that together they build up a large amount of functionality. By testing the top-level function, you're also testing how the functions it calls work together.
This is different from functions that are short simply because they don't do much.
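To make that concrete, here is a minimal sketch in Python. The names (`parse_line`, `line_total`, `order_total`) and the "name, quantity, price" line format are made up for illustration; they don't come from any particular codebase.

```python
def parse_line(line):
    """Lower level: split 'name, qty, price' into typed fields (hypothetical format)."""
    name, qty, price = line.split(",")
    return name.strip(), int(qty), float(price)


def line_total(qty, price):
    """Lower level: cost of a single order line."""
    return qty * price


def order_total(lines):
    """Higher level: short, but only because it delegates to the helpers above."""
    return sum(line_total(qty, price) for _, qty, price in map(parse_line, lines))


def test_order_total():
    # One test of the top-level function exercises parsing, the per-line
    # arithmetic, and the way the two are combined.
    assert order_total(["apple, 2, 0.50", "pear, 3, 1.00"]) == 4.0
```

`order_total` is a single line, yet a failure in `test_order_total` could point at any of the three functions or at how they fit together. A function that is short because it does almost nothing (a bare getter, say) wouldn't give a test that kind of reach.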