
All this "quirks mode" stuff is part of the specification, no? It makes it all a bit more complex than it has to be, but I do believe it's specified.

I'm not really sure if "you need to be bug-compatible" is still true; it probably was 15 years ago, but Chrome, Firefox, and WebKit tend to be pretty decent these days.



Quirks mode is one thing, but most browsers also carry hand-maintained workarounds for specific websites, and updating and handling those cases is a manual process. Pretty sure Chrome and Safari each have hundreds of these rules.


Do you have any references? I'd be interested to see the list and what workarounds are needed


https://github.com/WebKit/WebKit/blob/main/Source/WebCore/pa...

Don't know if this is everything, but there are a bunch of specific websites mentioned in here.


It's not clear to me if those are due to shortcomings in WebKit, the site, or if it's to be "bug-compatible" with anything else. Either way, 1,600 lines of code doesn't seem a lot to me.


If anything, websites have become way less clean, with more invalid HTML. I remember people, including myself, putting W3C validator icons on websites. Rarely do I see those these days, because of all the invalid HTML and dynamically generated markup. Maybe all the tags are closed nowadays, so at least there's that. But which elements are used inside which other elements, and whether they are used in a semantically appropriate way, is another matter.


One of the ideas behind HTML5 is that, while there is still some concept of validity and well-formedness, essentially any random stream of bytes describes exactly one DOM tree. In some cases the resulting tree is surprising, but even then it should be the same across all conformant parsers (modulo scripting support).

The end result is that validation isn't that interesting anymore, because the idea was that a valid (X)HTML document should parse the same across all browsers (which it mostly did, but that didn't say much about how it was actually rendered).
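To make the "one DOM tree from any byte stream" point concrete, here is a toy sketch (hypothetical, vastly simplified compared to the WHATWG tree builder) of one of its error-recovery rules: a new `<p>` start tag implies closing any `<p>` that is still open, so even unclosed paragraphs parse deterministically.

```python
import re

def parse_paragraphs(html: str):
    """Toy sketch of HTML5-style recovery: build a flat list of ("p", text)
    nodes, closing any open <p> when a new <p> start tag appears."""
    tree = []          # child nodes of an implicit body
    current = None     # text of the currently open <p>, if any
    for tok in re.split(r"(<[^>]+>)", html):
        if not tok:
            continue
        if tok.lower() == "<p>":
            if current is not None:
                tree.append(("p", current))   # implied </p>
            current = ""
        elif tok.lower() == "</p>":
            if current is not None:
                tree.append(("p", current))
                current = None
        elif current is not None:
            current += tok
        # (text outside any <p> would go elsewhere in a real parser)
    if current is not None:
        tree.append(("p", current))           # implied </p> at end of input
    return tree
```

So both `"<p>one<p>two"` and the "valid" `"<p>one</p><p>two</p>"` yield the same two sibling paragraph nodes; the real spec pins down recovery like this for every element, which is why all conformant parsers agree.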


Like most people I gave up on the whole semantic pedantry a long time ago. Correct heading order and basic semantics like <nav>: sure, that's great. But "no <p> inside <dt> allowed!" makes no real sense and is exceedingly pedantic.

The validator badges were kind of a backlash against the tag soup of the day; part of the reason for that was that everyone who knew how to program a VCR could get employed as a "webmaster" in those days, but also because the authoring tools for non-tech authors weren't as good. HN sees a lot of posts from non-tech people, often written on WordPress, Medium, or whatnot. 25 years ago it would more likely have been "tag-soup'd" by some non-tech person who just learned a bit of HTML.


Those badges regularly did not reflect reality.

Nowadays, HTML parsing is exhaustively defined in the form of a couple of state machines, so it’ll behave the same everywhere. It’s genuinely easy to implement perfectly (though it’ll still take a while because there is quite a bit of it).
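As a rough illustration of that state-machine style (the real WHATWG tokenizer has dozens of named states; this toy version has just two, "data" and "tag"):

```python
def tokenize(html: str):
    """Toy two-state tokenizer sketch: emit ('text', s) and ('tag', s) tokens.
    The spec-defined tokenizer works the same way, just with far more states
    (attributes, comments, doctypes, character references, ...)."""
    state = "data"
    buf = []
    tokens = []
    for ch in html:
        if state == "data":
            if ch == "<":
                if buf:
                    tokens.append(("text", "".join(buf)))
                    buf = []
                state = "tag"           # transition: data -> tag open
            else:
                buf.append(ch)
        elif state == "tag":
            if ch == ">":
                tokens.append(("tag", "".join(buf)))
                buf = []
                state = "data"          # transition: tag -> back to data
            else:
                buf.append(ch)
    if buf:  # the spec defines exact EOF recovery per state; this just flushes
        tokens.append(("text" if state == "data" else "tag", "".join(buf)))
    return tokens
```

Because every state transition is spelled out in the spec, two independent implementations walking the same input end up with the same token stream, which is what makes byte-for-byte identical parsing across browsers achievable.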



