Because most of the time, it reduces "test precision".
Which means when such a test detects an issue, there's now a bigger area of the code where the issue could be located.
> Doesn't efficiency outweigh respecting feature boundaries in tests?
In the end, what we're trying to minimize here is feature development time.
This of course depends on the feedback loop duration (build + run tests), which is why test efficiency is important; but there are other factors.
Feature development time also depends on how long it takes to locate the error when a test fails; a test signalling that "something is wrong somewhere" is not very useful (in some cases it can be, if it runs extremely fast, because you can generally spot the error with "git diff").
However, infinitely precise tests are often undesirable, because they tend to have extremely rigid expectations about the behavior/API of the code under test, discouraging refactoring, and thus slowing down the development process.
Let's keep in mind that "testing = freezing". More precisely, you're freezing what your tests depend upon (by adding friction to future modifications).
So be careful what you freeze: the initial intent of testing is to increase code flexibility. If you can't modify anything without breaking a test, you're probably missing their benefit.
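To make "freezing" concrete, here is a minimal sketch (the `Cart` class and its tests are hypothetical): the first test pins the internal representation, so any refactoring of the internals breaks it even when behavior is unchanged; the second pins only observable behavior, leaving the internals free to change.

```python
class Cart:
    def __init__(self):
        # Internal representation: a list of (name, price) tuples.
        self._items = []

    def add(self, name, price):
        self._items.append((name, price))

    def total(self):
        return sum(price for _, price in self._items)


def test_frozen_implementation():
    # Freezes the internal data structure: switching _items to a dict
    # (e.g. to merge duplicate items) turns this red with no behavior change.
    cart = Cart()
    cart.add("book", 10)
    assert cart._items == [("book", 10)]


def test_behavior_only():
    # Freezes only the observable behavior: internals can be refactored freely.
    cart = Cart()
    cart.add("book", 10)
    cart.add("pen", 2)
    assert cart.total() == 12
```

The second test adds friction only to changes that actually alter what users of `Cart` can observe, which is exactly what you want frozen.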
> Because most of the time, it reduces "test precision". Which means when such a test detects an issue, there's now a bigger area of the code where the issue could be located.
Tests are not a debug tool. Tests are here to tell you when you broke something.
When that happens you can get your debugging toolbox out: stack traces, profilers, and things like GDB. Then follow the steps your test script did.
All the projects I've most enjoyed working on had clear, explicit test failures, which almost always gave me enough information to hunt down what I'd broken and where. Obviously that's not realistic for some kinds of projects. But even in a legacy C project, if I had to pull out stack traces and profilers and GDB every single time I wanted to understand why a test failed, I would say I'm working in a pretty terrible codebase.
But "when you click on this after that, this happens while it shouldn't" is usually enough info to start debugging: you have reproducible steps, which you can follow with a debugger running so you see exactly where and how things break.
And it is a lot less brittle than "well, we tried to refactor some minor thing and now everything is red, but we don't know whether the software's behavior changed or whether our tests were just checking the implementation".
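A sketch of that failure mode (all names hypothetical): the first test freezes the call graph, so inlining the helper or swapping it for a regex turns the test red with zero behavior change; the second only turns red when the observable result changes.

```python
def _strip_spaces(text):
    # Internal helper; today normalize() happens to delegate to it.
    return text.strip()


def normalize(text):
    return _strip_spaces(text).lower()


def test_implementation_coupled():
    # Brittle: spies on the helper to assert *how* normalize works.
    # A refactor that removes or replaces _strip_spaces makes this red
    # even though normalize's output is identical.
    global _strip_spaces
    calls, original = [], _strip_spaces
    _strip_spaces = lambda t: (calls.append(t), original(t))[1]
    try:
        normalize("  Hello  ")
        assert calls == ["  Hello  "]
    finally:
        _strip_spaces = original


def test_behavior_coupled():
    # Robust: only goes red if the observable behavior changes.
    assert normalize("  Hello  ") == "hello"
```

When a refactor turns the first kind of test red, you learn nothing about whether the software still works; when it turns the second kind red, you know the behavior actually changed.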
Well, if you're determined to write stupid tests you can do it at any level. And that is not enough info to start debugging unless you had the machine in a known and reproducible state to begin with, which is where we get back to writing focused, reproducible tests.