
Very funny anecdote. What bothers me though is that this implies that Google seems to think they own the web.

For example, they told the guy that he was lucky that they were willing to give him a backup, but it seems to me that Google's the one that should be taking responsibility for their actions.

It's a short jump from "you have to use POST or we'll delete your stuff" to "you have to follow Google standard X or we won't index your site."

Welcome to the first web empire.



No, any web crawler would have done the same thing. It is simply an error to modify content with a GET.
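To sketch the distinction (a hypothetical toy dispatcher, not the app from the story): per the HTTP spec, GET must be "safe", i.e. it must not change server state, so a crawler may follow every link without worry; state-changing actions belong behind POST.

```python
# Toy in-memory "site" illustrating safe GET vs. state-changing POST.
# All names here are hypothetical illustrations.
articles = {1: "First post", 2: "Second post"}

def handle_request(method, path):
    """Dispatch a request; returns (status, body)."""
    if path.startswith("/delete/"):
        article_id = int(path.rsplit("/", 1)[1])
        if method == "POST":
            articles.pop(article_id, None)
            return 200, "deleted"
        # A crawler issues plain GETs to every link it discovers and,
        # per the spec, may assume none of them mutate state. Wiring a
        # delete action to GET is the application's bug, not the crawler's.
        return 405, "GET must be safe; use POST to delete"
    return 200, "ok"
```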


I agree. Google simply followed the semantics of HTTP. They had similar problems with the Google Web Accelerator before. It's really hard to blame Google for other programmers' mistakes.


Additionally, the interface to web crawlers (robots.txt) is well defined.
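For illustration, that interface can be exercised with Python's standard-library parser (the disallowed paths here are hypothetical; a rule like this would have kept crawlers away from state-changing links):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that fences off state-changing URLs.
robots_txt = """\
User-agent: *
Disallow: /delete/
Disallow: /admin/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A well-behaved crawler checks these rules before fetching a URL.
print(rp.can_fetch("Googlebot", "http://example.com/delete/42"))   # False
print(rp.can_fetch("Googlebot", "http://example.com/article/42"))  # True
```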


Yes, it is an error. What alarms me is Google's attitude.

What happens when we're dealing with interfaces that are more ill-defined? Will Google continue to demand that you follow their way of doing things?

Google's attitude in this case suggests they will.


It's not "their way of doing things", it is just the way the web works (per the HTTP spec). Any crawler would have done the same thing in that situation -- the fact that it happened to be Google is merely coincidental. Given the scale at which they operate, you can't expect Google or any other web-scale crawler to be a mind-reader.


At least search engines can be told not to touch a certain link. If a bored user did the same thing, whom would you blame?



