Imo this is because the responsibility is not clearly defined and can be argued ...

usrbinbash · on March 9, 2022

> but you choose to redirect the pipe to a different location

The output doesn't go into a pipe, however; it goes to /dev/full. Redirection happens before the process is started, so the program is, in fact, writing directly to a file descriptor that returns an error upon write.

CraigJPerry · on March 9, 2022

In this scenario you didn’t write any bytes though. You made a call to write to standard out (your process’s file handle 1) and didn’t succeed, you didn’t handle the possible error condition, you just silently ignored it.

I think this is pretty cut and dried - the failure is inside your process’s address space and the programmer error is that you haven’t handled a reported error.

>> what happens to the bytes AFTER the pipe?

There isn’t a pipe involved here. When your process was created, its stdout was connected to /dev/full, and then your program began executing.

usrbinbash · on March 9, 2022

> you didn’t handle the possible error condition, you just silently ignored it.

Problem is, the error condition is not even that obvious. I tried it, and printf() will happily return the number of bytes written, even when redirecting stdout to /dev/full.

I am not 100% sure, but I think this has to do with the fact that printf uses buffered io, and writing the bytes to the buffer will work. It's only when the buffer is flushed that this will become a problem, but this would need to be handled in the code to show an error message.
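A minimal sketch of what handling this in the code could look like, assuming a hypothetical helper named `say_checked` (not from the article or any library). It checks the buffered write and, more importantly, the explicit flush, which is where a failing descriptor such as /dev/full would be expected to surface:

```c
#include <stdio.h>

/* Hypothetical helper: print `msg` to `out` and report whether the
 * bytes actually made it past the stdio buffer.
 * Returns 0 on success, -1 if the write or the flush failed. */
int say_checked(FILE *out, const char *msg)
{
    /* On a buffered stream this usually succeeds, because the bytes
     * only land in the in-memory stdio buffer. */
    if (fputs(msg, out) == EOF)
        return -1;

    /* The flush is where the underlying write happens, so this is
     * where an error from the file descriptor gets reported. */
    if (fflush(out) == EOF)
        return -1;

    /* Also consult the stream's sticky error flag. */
    return ferror(out) ? -1 : 0;
}
```

A hello world built on this could return a non-zero exit code from `main` whenever `say_checked(stdout, "Hello, world!\n")` returns -1.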

hnlmorg · on March 9, 2022

I don’t personally agree with that judgement. While the failure condition is at the OS level, it’s still affecting the function of the program in an unexpected way.

Plus the whole point of STDOUT is that it is a file. So it shouldn’t change the developer’s mental model if that file happens to be a pseudo TTY, a pipe or a traditional persistent file system object. This flexibility is one of the core principles of UNIX and it’s what makes the POSIX command line as flexible as it is.

contradictioned · on March 9, 2022

With that in mind, is this criticism of the Java hello world valid? Its output abstracts more than stdout, and maybe on Windows this bug would not occur. (I don't know, just discussing)

oneeyedpigeon · on March 9, 2022

I feel like this is misrepresenting the article's point which isn't literally "hello world is buggy if it returns success on failure" but more "you should do error-handling". In this very specific case, you can argue that it's irrelevant. But if your program writes to a log file, or writes to a data file that it later reads from, it had better include some error-handling.

The fact that there's redirection is a ... misdirection. The redirection is only used to proxy a real-life case that can happen even when no redirection is taking place.

autoexec · on March 9, 2022

What's the real-life case where hello world fails because the file system is full?

You could do all kinds of things that would cause hello world to "fail". A broken monitor (or even one unplugged) wouldn't show "hello world" or give any indication of an error either, but it's hardly the code's fault. The code does what it's supposed to and ignores all kinds of other things that could go horribly wrong. That's not really a bug, just a known and expected limitation of the program's scope.

oneeyedpigeon · on March 9, 2022

As I said, not hello world necessarily, but any program that writes output can encounter this problem, and there's an overlap with general file-writing even.

ygra · on March 9, 2022

I thought about this too a bit just now. But I think it's not the shell setting up stuff outside your process that then fails. Rather you already get handles to the "full" file system at process creation and then it's your problem. And traditionally, the behaviour you get from all the standard streams is very unpredictable, depending on where they point.

mirekrusin · on March 9, 2022

Your program has a bug because it can write nothing, or only part of its output, and will still always return a zero exit code. I.e., think about using your program as part of a bash script, where you often rely on process exit codes.

dataflow · on March 9, 2022

> is it my program’s responsibility to check what happens to the bytes AFTER the pipe?

No, but it's not "after". Rather, it's your responsibility to handle backpressure by ensuring the bytes were written to the pipe successfully in the first place.

This isn't just about the filesystem being full btw. If you imagine a command like ./foo.py | head -n 10, it only makes sense for the 'head' command to close the pipe when it's done, and foo.py should be able to detect this and stop printing any more output. (This is especially important if you consider that foo.py might produce infinite lines of output, like the 'yes' program.)

I would argue this is not necessarily even an error from a user standpoint, so the return code from foo.py should still be zero in many cases: a pipe-is-closed error just means the consumer simply didn't want the rest of the output, which is fine [1], whereas an out-of-disk-space error is probably really an error. Handling these robustly is actually difficult though, because (a) you'd need to figure out why printf() failed (so that you can treat different failures differently, but it's painful), and (b) you need to make sure any side effects in the program flow up to the printf() are semantically correct "prefixes" of the overall side effect, meaning that you'd need to pay careful attention to where you printf(). (Practically speaking, this makes it difficult to even have side effects that respect this, but that's an inherent problem/limitation of the pipeline model...)

FWIW, I would be very curious if anyone has formalized all of these nuances of the pipeline model and come up with a robust & systematic way to handle them. It seems like a complicated problem to me. To give just one example of a problem that I'm thinking of: should stderr and stdout behave the same way with respect to "pipe is closed"? e.g. should the program terminate if either is closed, or if both are closed? The answer is probably "it depends", but on what exactly? What if they're redirected externally? What if they're redirected internally? Is there a pattern you can follow to get it right most of the time? There's a lot of room for analysis of the issues that can come up, especially when you throw buffering/threading/etc. into the mix...

[1] Or maybe it isn't. Maybe the output (say, some archive format like ZIP) has a footer that needs to be read first, and it would be corrupt otherwise. Or maybe that's fine anyway, because the consumer should already understand you're outputting a ZIP, and it's on them if they want partial output. As always, "it depends". But I think a premature stdout closure is usually best treated as not-an-error.
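As a rough sketch of the "treat different failures differently" policy, assuming a POSIX system where a failed flush leaves the cause in `errno` (and where SIGPIPE is ignored or blocked, so that a closed pipe is reported as EPIPE rather than killing the process). The function names are hypothetical:

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical policy: map the errno left by a failed stdout write to
 * an exit status. A closed pipe (EPIPE) is read as "the consumer has
 * had enough"; anything else is treated as a real failure. */
int exit_status_for_write_error(int err)
{
    switch (err) {
    case 0:     return EXIT_SUCCESS;  /* nothing went wrong */
    case EPIPE: return EXIT_SUCCESS;  /* reader went away, e.g. head */
    default:    return EXIT_FAILURE;  /* ENOSPC, EIO, EBADF, ... */
    }
}

/* Flush stdout and turn the outcome into an exit status. */
int finish_stdout(void)
{
    errno = 0;
    if (fflush(stdout) == EOF)
        return exit_status_for_write_error(errno);
    return EXIT_SUCCESS;
}
```

Whether EPIPE should really map to success is exactly the "it depends" from the footnote; this only shows where such a decision would live.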

hvdijk · on March 9, 2022

> This isn't just about the filesystem being full btw. If you imagine a command like ./foo.py | head -n 10, it only makes sense for the 'head' command to close the pipe when it's done, and foo.py should be able to detect this and stop printing any more output.

The usual way of handling this is by not (explicitly) handling it. Writes to a closed pipe are special: they do not normally fail with a status that the program then all too often ignores. Instead, they result in a SIGPIPE signal whose default action kills the process. Extra steps are needed to not kill the process. No other kind of write error gets this special treatment that I am aware of.
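To illustrate the "extra steps", here is a small sketch for a POSIX system. It ignores SIGPIPE around a write to a pipe that has no reader, so the write fails with EPIPE instead of terminating the process. The function name `errno_from_closed_pipe` is made up for this example:

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical demo: write to a pipe whose read end is already closed,
 * with SIGPIPE ignored so the process survives the write.
 * Returns the errno of the failed write(), 0 if the write unexpectedly
 * succeeded, or -1 if the setup itself failed. */
int errno_from_closed_pipe(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* The extra step: without this, the write below would raise
     * SIGPIPE, whose default action terminates the process. */
    void (*old_handler)(int) = signal(SIGPIPE, SIG_IGN);
    if (old_handler == SIG_ERR) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    close(fds[0]);  /* no reader is left on this pipe */

    ssize_t n = write(fds[1], "x", 1);
    int err = (n == -1) ? errno : 0;

    close(fds[1]);
    signal(SIGPIPE, old_handler);  /* restore the previous disposition */
    return err;
}
```

On a POSIX system this is expected to return EPIPE; a program that wants to decide for itself what a vanished reader means would handle that error instead of relying on the signal.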

dataflow · on March 9, 2022

That would be at best a Linux extension, not some general C behavior you can assume when writing your program.

That said though, I can't even reproduce what you're saying on Linux:

  printf '%s\n' '#include <stdio.h>' 'int main() { setvbuf(stdout, NULL, _IONBF, 0); int r = fputs("Starting\n", stdout); fflush(stdout); fprintf(stderr, "%d\n", r); }' | cc -x c - && ./a.out >&-
  // Prints '0' instead of dying

hvdijk · on March 9, 2022

That's a POSIX thing. It doesn't apply to all C implementations but it does apply to many more than just Linux-based ones. You've not got a closed pipe so you wouldn't see it, you've just got a closed file descriptor. Try running it as

  ./a.out | :

and you will probably see it. I say probably because there is a timing aspect as well: the write may happen before the pipe gets closed, in which case it will not fail, but that is unlikely.

dataflow · on March 9, 2022

Yeah I should've said POSIX, my bad. But yeah my point was it's not plain C behavior.

And yes on Linux I do see it with your no-op example now. Though for some reason not with 'head'... what's going on? Is it not closing the pipe when it exits?

  $ printf '%s\n' '#include <stdio.h>' '#include <unistd.h>' 'int main() { setvbuf(stdout, NULL, _IONBF, 0); int r = puts("Starting...\n"); r += fputs("First\n", stdout); fflush(stdout); usleep(1000000); fprintf(stderr, "%d\n", r); }' | cc -x c - && ./a.out | head -n 1
  Starting...
  19

Edit: D'oh, see below.

hvdijk · on March 9, 2022

That would be the timing aspect of it. You have:

a) a.out writes line 1

b) a.out writes line 2

c) head reads line 1

d) head closes the pipe

We know that b and c both happen after a, and that d happens after c. However, we do not know whether b happens before c, between c and d, or after d. Your a.out process will only get killed by SIGPIPE if it happens after d.

On my system, running a.out under strace is enough to slow it down enough to affect the timing and see the SIGPIPE you were expecting. You may alternatively insert artificial delays in your test program such as by calling the sleep() function between the two lines of output to see the same result.

dataflow · on March 9, 2022

Sorry, I think I edited my comment while you were replying. But I just noticed the problem in the most recent version was that I didn't write to stdout after the usleep(), so it never raised SIGPIPE. Thanks.

jstanley · on March 9, 2022

> it's not plain C behavior.

But pipes aren't a C thing in the first place. "unistd.h" is not a C thing, file descriptors aren't a C thing.

dataflow · on March 9, 2022

The C program would just be using stdio to interact with pipes.

And not every platform with pipes supports SIGPIPE with such behavior.

hgomersall · on March 9, 2022

What the pipe does is orthogonal to what the programme should do. The problem here is that errors are not being handled. There are languages such as Rust that enforce error handling, whereby the policy on error is made explicit. The nuances you highlight are around what the errors should describe, which ultimately leads to more potential granularity in the error policy.

masklinn · on March 9, 2022

> If my program writes to the standard output, but you choose to redirect the pipe to a different location, is it my program’s responsibility to check what happens to the bytes AFTER the pipe?

The pipe is your standard output. Your very program is created with the pipe as its stdout.

> After all: my program did output everything as expected. The part which fucked up was not part of my program.

But you are wrong, your program did not output everything as expected, and it failed to report that information.