I know, but it’s not “just bytes” as per parent comment. You cannot infer the le...

ElevenLathe · on April 22, 2024

Less specific interfaces let you do less interesting things, but are more resilient. It's an engineering tradeoff. Purpose-built interfaces that fully expose and understand domain-level semantics are great in certain circumstances, but other times you want a certain minimum abstraction (IP packets and 'bags-of-bytes' POSIX file semantics are good examples) that can be used to build better ones.

If the rollout of HTTP had required that all the IP routers on the internet be updated to account for it, we likely would not have it. Likewise, if we required that all the classic Unix text utilities like wc, sort, paste, etc. did meaningful things to JSON before we could standardize JSON, adoption would likely have suffered.

rusk · on April 22, 2024

The basic Unix tools do account for variable-width characters, though. Variable-width characters are baked into most operating systems. When you use these commands, the decoding is implicit.
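To make the byte/character distinction concrete, here is a minimal sketch in Python. It only mirrors the idea behind counting bytes versus counting characters (roughly what `wc -c` and `wc -m` report in a UTF-8 locale); the sample string is made up for illustration and is not how any particular utility is implemented.

```python
# Hypothetical sample line; escapes are used so the code points are unambiguous.
line = "na\u00efve caf\u00e9\n"   # "naïve café" plus a newline

raw = line.encode("utf-8")        # the bytes as they would sit in a file

char_count = len(line)            # counts code points (decoding already done)
byte_count = len(raw)             # counts raw bytes (no decoding needed)

print(char_count, byte_count)     # 11 13
```

The two non-ASCII letters each take two bytes in UTF-8, which is where the difference of two comes from.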

samatman · on April 22, 2024

You can transport it to an architecture of different endianness without loss of information, and without needing metadata or a transformation at the destination.

There are important ways in which it is, in fact, "just bytes".

rusk · on April 22, 2024

Endianness etc. is a feature of the encoding. Most JSON implementations I’ve used require the raw bytes to first be decoded as such.

samatman · on April 23, 2024

No, endianness is not a feature of the UTF-8 encoding. There isn't a UTF-8LE and a UTF-8BE. That's because the code unit is a byte.

Forget "decoding", you have to parse JSON. But you don't have to figure out how it's encoded first. Because it's a byte format. You already know.
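A small Python sketch of the point about byte order, assuming nothing beyond the standard library: UTF-16 has little-endian and big-endian forms that differ in byte order, while UTF-8 yields one byte sequence regardless of the host. The sample document is invented for illustration.

```python
import json

text = "\u00e9"  # "é", U+00E9

utf8 = text.encode("utf-8")
utf16_le = text.encode("utf-16-le")
utf16_be = text.encode("utf-16-be")

print(utf8.hex())      # c3a9  (only one possible byte order)
print(utf16_le.hex())  # e900
print(utf16_be.hex())  # 00e9  (same code unit, bytes swapped)

# Python's json.loads accepts bytes directly (since 3.6), so a UTF-8
# document can go from the wire to the parser without an explicit decode.
doc = '{"name": "\u00e9"}'.encode("utf-8")
parsed = json.loads(doc)
print(parsed["name"] == text)  # True
```

Whether the decode step inside the parser counts as "decoding first" is exactly what the thread is arguing about; the sketch only shows that the caller does not have to choose a byte order.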

rusk · on April 23, 2024

There isn’t a UTF-8LE/BE because it is implicitly BE for wide characters. Any byte in a wide-character sequence cannot meaningfully be interpreted (except for character class, page, etc.) without its companions, so it is not just bytes. There is an element of presentation that must happen before “mere bytes” are eligible for JSON.
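The claim about bytes needing their companions can be checked with a short Python sketch. The helper name `is_continuation` is hypothetical; the bit pattern it tests (`10xxxxxx` for continuation bytes) comes from the UTF-8 definition. It shows that a byte from the middle of a multi-byte sequence can be classified on its own but cannot be decoded to a character on its own.

```python
euro = "\u20ac".encode("utf-8")   # "€" encodes to three bytes: e2 82 ac

def is_continuation(byte: int) -> bool:
    """True if the byte has the 10xxxxxx pattern of a UTF-8 continuation byte."""
    return (byte & 0xC0) == 0x80

flags = [is_continuation(b) for b in euro]
print(flags)                      # [False, True, True]

# A continuation byte taken alone is not a valid UTF-8 sequence.
try:
    euro[1:2].decode("utf-8")
    lone_byte_decodes = True
except UnicodeDecodeError:
    lone_byte_decodes = False

print(lone_byte_decodes)          # False
print(euro.decode("utf-8"))       # the full sequence decodes fine
```

The order of the three bytes is fixed by the encoding itself (lead byte first), so it does not change with the endianness of the machine that stores or transmits them.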