
Very funny anecdote. What bothers me though is that this implies that Google seems to think they own the web.

For example, they told the guy that he was lucky that they were willing to give him a backup, but it seems to me that Google's the one that should be taking responsibility for their actions.

It's a short jump from "you have to use POST or we'll delete your stuff" to "you have to follow Google standard X or we won't index your site."

Welcome to the first web empire.



No, any web crawler would have done the same thing. It is simply an error to modify content with a GET.
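To sketch the distinction (a hypothetical toy dispatcher, not the app from the story): per the HTTP spec, GET must be "safe", i.e. it must not change server state, so a crawler may follow every link without worry; state-changing actions belong behind POST.

```python
# Toy in-memory "site" illustrating safe GET vs. state-changing POST.
# All names here are hypothetical illustrations.
articles = {1: "First post", 2: "Second post"}

def handle_request(method, path):
    """Dispatch a request; returns (status, body)."""
    if path.startswith("/delete/"):
        article_id = int(path.rsplit("/", 1)[1])
        if method == "POST":
            articles.pop(article_id, None)
            return 200, "deleted"
        # A crawler issues plain GETs to every link it discovers and,
        # per the spec, may assume none of them mutate state. Wiring a
        # delete action to GET is the application's bug, not the crawler's.
        return 405, "GET must be safe; use POST to delete"
    return 200, "ok"
```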


I agree. Google simply followed the semantics of HTTP. They had similar problems with the Google Web Accelerator before. It's really hard to blame Google for other programmers' mistakes.


Additionally, the interface to web crawlers (robots.txt) is well defined.
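For illustration, that interface can be exercised with Python's standard-library parser (the disallowed paths here are hypothetical; a rule like this would have kept crawlers away from state-changing links):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that fences off state-changing URLs.
robots_txt = """\
User-agent: *
Disallow: /delete/
Disallow: /admin/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A well-behaved crawler checks these rules before fetching a URL.
print(rp.can_fetch("Googlebot", "http://example.com/delete/42"))   # False
print(rp.can_fetch("Googlebot", "http://example.com/article/42"))  # True
```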


Yes, it is an error. What alarms me is Google's attitude.

What happens when we're dealing with interfaces that are more ill-defined? Will Google continue to demand that you follow their way of doing things?

Google's attitude in this case suggests they will.


It's not "their way of doing things", it is just the way the web works (per the HTTP spec). Any crawler would have done the same thing in that situation -- the fact that it happened to be Google is merely coincidental. Given the scale at which they operate, you can't expect Google or any other web-scale crawler to be a mind-reader.


At least search engines can be told not to touch a certain link. If a bored user did the same thing, whom would you blame?



