
Serious question: why do we still have 404 pages?

A 404 means the server can't find what I asked for, can't tell me where it went, has no idea what I'm looking for. By definition, my browser shouldn't show me the content that was returned-- it knows it's not what I want!

If I get a 404 code, the page shouldn't change. The browser should just show a message indicating that that resource doesn't exist, include the reason message if it's something other than "NOT FOUND", and let me either try a different link or correct my spelling. If I clicked a link to get there, it could even set a style on the link to indicate the resource doesn't exist.

It's strange how much we're still living in the nineties. I wonder how many of the use cases of Ajax could be replaced by an intelligent browser using the existing HTTP standard?



The server does have one way of getting an idea of what you’re looking for: the URL you requested. I’ve seen 404 pages that search the site from a slug in the URL and display the search results. There’s usually not enough in the URL for it to be useful, but I think once a site found the page I was looking for like that.

Also, if the 404 link was from another site, now that you’re on the site with the missing page, you can at least click their navigation and try to navigate to the resource yourself. Like if you follow a link titled “buy red yarn” from knitting-patterns.com to buy-yarn.com/products/8221 and get a 404, you can click Store and search for “red” and thus find buy-yarn.com/store/2539.
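For illustration, the slug-search idea described above might look roughly like the sketch below. The `PAGES` index and the function names are made up for this example and are not taken from any real site; the yarn URLs are reused from the comment. It also shows why a bare numeric ID usually gives the search nothing to work with.

```python
import re
from urllib.parse import urlparse

# Hypothetical site index (path -> page title), invented for this sketch.
PAGES = {
    "/store/2539": "Red wool yarn",
    "/store/2540": "Blue cotton yarn",
    "/patterns/17": "Red scarf pattern",
}

def slug_terms(url):
    """Split the last path segment of a URL into lowercase search terms."""
    path = urlparse(url).path
    last = path.rstrip("/").rsplit("/", 1)[-1]
    # Keep alphabetic words only; a numeric ID like "8221" yields no terms.
    return [t.lower() for t in re.split(r"[^A-Za-z]+", last) if t]

def suggest(url):
    """Return paths of pages whose title matches a slug term, best match first."""
    terms = slug_terms(url)
    scored = []
    for path, title in PAGES.items():
        score = sum(term in title.lower() for term in terms)
        if score:
            scored.append((score, path))
    # sorted() is stable, so ties keep the index order.
    return [path for score, path in sorted(scored, key=lambda s: -s[0])]
```

With a descriptive slug such as `/products/red-yarn` this ranks `/store/2539` first; with `/products/8221` it finds no terms and suggests nothing.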


But that's not a 404 Not Found. That's what I'm saying: 404 literally means the server has no idea what you want and no way to find it. If the server can figure out what you want from the URL, the proper response is a 301 or 302. If it can't figure out exactly what you want, but has some guesses, it should be 300 Multiple Choices.

I'm not an HTTP hardass, but if we don't care about using the right response code then it might as well be 200, right? Because they're returning a resource for that URL, just a completely useless one.
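As a rough sketch of the rule argued for above (one confident match redirects, several guesses give 300, nothing gives 404), assuming the server has already produced a list of candidate paths. `pick_status` is a hypothetical helper, and this encodes the commenter's reading rather than anything the HTTP spec mandates.

```python
from http import HTTPStatus

def pick_status(candidates):
    """Map a list of candidate paths to (status, redirect_target)."""
    if len(candidates) == 1:
        # Exactly one match: redirect to it (302 would also fit the argument).
        return HTTPStatus.MOVED_PERMANENTLY, candidates[0]
    if len(candidates) > 1:
        # Several guesses: let the client choose among them.
        return HTTPStatus.MULTIPLE_CHOICES, None
    # No idea what was wanted: a genuine 404.
    return HTTPStatus.NOT_FOUND, None
```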


I would disagree.

If I request twitter.com/userr but there is no account named "userr", should Twitter decide I misspelled that and redirect me to "user" if that exists? I say no.

First of all, a 404 response needs to be sent regardless so that a script can check for 404 and handle that as deemed appropriate. If I got a 301 or 302 to what the server thought I was looking for, the script wouldn't know that the server is now guessing at what I wanted. I assume this is important for bots like the Google crawler.

Second, content with that 404 is very useful. It adds context for a user who cares. A well-designed 404 will tell the user that the page/item/person/etc. couldn't be found. Then it will offer an action, such as suggestions of what it could be. Getting back to my "userr" example, there could be links to "user", "users", etc. There could be a prominent search feature or a way to report this to the company. Maybe a way to contact support, like a live chat widget. Furthermore, 404s in your logs are very useful for seeing what is commonly not found. Do people regularly try going to example.com/login even though your login page is at example.com/user_login? Apache access logs will show this very nicely and you can then decide to manually add a redirect.
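A minimal sketch of the log check described above, assuming Apache's Common Log Format. The sample lines are invented for illustration (the paths echo the examples in the comment), and `count_not_found` is a hypothetical helper, not part of any Apache tooling.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it, e.g.
#   "GET /login HTTP/1.1" 404
LOG_LINE = re.compile(r'"[A-Z]+ (?P<path>\S+) [^"]*" (?P<status>\d{3})\b')

# Invented sample data, not real traffic.
SAMPLE_LOG = """\
203.0.113.5 - - [01/Jan/2024:10:00:00 +0000] "GET /login HTTP/1.1" 404 512
203.0.113.6 - - [01/Jan/2024:10:00:05 +0000] "GET /user_login HTTP/1.1" 200 2048
203.0.113.7 - - [01/Jan/2024:10:00:09 +0000] "GET /login HTTP/1.1" 404 512
203.0.113.8 - - [01/Jan/2024:10:00:12 +0000] "GET /userr HTTP/1.1" 404 512
"""

def count_not_found(log_text):
    """Count how often each path was answered with a 404."""
    counts = Counter()
    for line in log_text.splitlines():
        m = LOG_LINE.search(line)
        if m and m.group("status") == "404":
            counts[m.group("path")] += 1
    return counts
```

Calling `count_not_found(SAMPLE_LOG).most_common()` lists the most frequently missed paths first, which is the list you would scan when deciding where to add a redirect.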


To your first point, I don't see the issue. Your crawler hits a URL and gets 301. It's either redirected to a URL already in its canonical database, and therefore knows that the redirected URL is garbage, or it gets something new-- congratulations!

404 is for a resource which can't be found; if the resource can be found, 404 is wrong for scripts as much as for users.

Your second point is valid, but a little past the point, I think. My only actions after seeing a 404 are to check my spelling, hit back, close the tab, or start from the site root. Maybe it would be good for my browser to redirect to / or something by default, assuming there isn't a page loaded; I'm open to ideas. I'm just saying, it's never been helpful to me to download a 1.2 MB transparent PNG that says "Sorry your browser doesn't know what 404 means."

(Or another example ripped from real life: I literally just now hit a link to a video on a third-tier media site and was rewarded with a completely different video, with little text at the top that says "Error: Not found". This is what people really use custom 404s for.)


The webmaster wanted you to see it. If browsers didn't show the content of a 404, webmasters would just send you the same "not found" page but with a 200 code, which would be even worse.


Well, no, it would be the same. That's my point; the conventional treatment of 404 on the web is a 200 that says "Sorry." So what's the point of it?


It would be the same for the browser user agent, but a lot of other user agents might have trouble. Anyway, why are you advocating a change that you recognize will have no effect?


? The change I'm advocating is in browser behavior, and would definitely make a difference for any site that complies with HTTP. There's nothing preventing malicious hosts from continuing to serve pages full of ads, but then, there never was.


It sounds like he is claiming that if a major browser changed this behavior, websites would just reconfigure their servers to return the error page with a 200 response, because they would prefer the browser show their own page rather than a browser-dependent error message. That would end up making 404 completely useless, even for actual tools like wget, because it would be against webmasters' interests to use them anymore.

Do you disagree with that claim?


Well, my counterargument is that a change in browser behavior would make sites more usable, encouraging the proper use of 404. Of course, if that weren't the case, I'd agree there'd be no point in changing :P


If webmasters wanted a spartan base error they could easily have one though.


I don't really see a problem? I think the point is you wanted to go to foobar.com/naughtygirls but it's actually foobar.com/naughty-girls. At the very least, the 404 brings you to the right website, presumably, ideally with a coherent enough sitemap/navigation to get you to your desired place.

Though, I wonder how often 404s actually happen outside of people not properly setting up redirects. I imagine 95% of people just type in domains or use bookmarks.


There's definitely a use case there-- but it should be configurable behavior to request or redirect to a site map on 404. I may or may not want to trawl through Photobucket to find that missing image; I'd just like my browser to give me the choice before erasing my current page state.


This is reasonable. I always like being given a choice. Though, I'm sure this could be achieved with an extension (assuming people properly set up their 404s).


You would be surprised at how many sites do not send a 404 response header.


Sure, because why would they? It doesn't change (almost) anything about browser behavior. If anything, what I'm proposing would provide an incentive to comply with the standard, since it would make your page more usable.



