Google Now Fills Out Forms & Crawls Results (searchengineland.com)
34 points by raghus on April 11, 2008 | hide | past | favorite | 18 comments


I noticed this happening in my blog's logs months ago, for my blog's search form. I thought it was fascinating. The keywords it chose to search were generally meaningful words that occurred frequently on the site. I'm not sure if they were correlated with the Google search terms that hit my blog.

The corresponding SEO tip is to make sure that your site's search feature works, delivers good results for the keywords and delivers the results to which you want Google to pay attention.


Wow, this is kind of a big deal. Crawling the deep web seemed like the one real feature Google's vaporware competitors could tout.


The deep web isn't limited to GET requests and forms that don't require passwords.


Yes, but this is a start... give it a couple years.

And this is still more than the vaporware competitors that are still vaporware.


I propose a new sport - Google Crawler hunting. Kind of like Snipe hunting, only you lay traps. Object is to see how long you can keep the Google bot there by generating new pages programmatically, and how many unique things you can get the bot to do.
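A minimal version of such a trap, as a sketch: every page the bot requests is generated on the fly and links to child pages derived from the current path, so the crawler never runs out of "new" URLs. The function name and fanout are hypothetical, and a real trap would sit behind a web server route.

```python
import hashlib

def trap_page(path: str, fanout: int = 3) -> str:
    """Generate an HTML page whose links lead to ever-deeper,
    programmatically derived child pages -- a classic spider trap."""
    links = []
    for i in range(fanout):
        # Derive a stable child URL from the current path, so each
        # page offers fresh-looking links for the bot to follow next.
        child = hashlib.md5(f"{path}/{i}".encode()).hexdigest()[:8]
        links.append(f'<a href="{path}/{child}">page {child}</a>')
    return "<html><body>" + " ".join(links) + "</body></html>"

# Each request yields links the crawler has never seen before.
print(trap_page("/trap"))
```

Scoring "how long you keep the bot there" is then just counting requests to the trap prefix in your access logs.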


Here's Matt Cutts explaining it a little more: http://www.mattcutts.com/blog/solved-another-common-site-rev...


So it's OK for them to do this? But if you do the same against Google (scraping results), presumably Bad Things happen?


Is their search url listed in their robots.txt?


Seems strange that you didn't just look when asking that question (fewer characters to type, at least). Lots of things are disallowed.

http://google.com/robots.txt
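For anyone curious how a well-behaved crawler would actually apply those rules, the stdlib can parse them. This sketch uses a hypothetical excerpt in the style of google.com/robots.txt (the live file is much longer and may differ):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt, not the actual current file.
ROBOTS_TXT = """\
User-agent: *
Allow: /search/about
Disallow: /search
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# /search is disallowed for all user agents...
print(rp.can_fetch("*", "https://google.com/search?q=test"))
# ...but the explicitly allowed sub-path is fine.
print(rp.can_fetch("*", "https://google.com/search/about"))
```

Note the Allow line is listed before the Disallow line because Python's parser applies the first matching rule rather than the longest match.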


I think it was a rhetorical question.


Yes, it was. But I should have been clearer for the kind of literal-minded argument starters who tend to hang around here ;)

I wonder whether Google always honours robots.txt.


I hope so because otherwise they could get stuck in one of those:

http://en.wikipedia.org/wiki/Spider_trap


What an interesting problem... so, if you were to build a searchbot, what would it type into those form fields it found?

It has to be very difficult, especially since you can't do too many submits. Could you somehow infer what to put in there based on searches in your own database already?
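One plausible heuristic, in line with the observation upthread that the bot searched for frequent on-site words (a sketch, not Google's actual method; the function and thresholds are made up): rank the meaningful terms already seen on the site's pages and use the top few as probe queries.

```python
import re
from collections import Counter

STOPWORDS = {"the", "and", "a", "of", "to", "in", "is", "it", "that"}

def probe_terms(pages: list[str], k: int = 3) -> list[str]:
    """Pick the k most frequent meaningful words across already-crawled
    pages -- candidate values to type into the site's search box."""
    counts = Counter()
    for page in pages:
        for word in re.findall(r"[a-z]+", page.lower()):
            # Skip stopwords and very short tokens.
            if word not in STOPWORDS and len(word) > 3:
                counts[word] += 1
    return [w for w, _ in counts.most_common(k)]

pages = [
    "Scaling a web crawler is mostly about politeness and scheduling.",
    "The crawler fetches pages and the scheduler decides what is next.",
]
print(probe_terms(pages))  # "crawler" ranks first
```

This also respects the "can't do too many submits" constraint: a handful of high-frequency terms likely covers most of what the site's search can reveal.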


What happens when google starts editing wiki pages and filing new bug reports/tickets?


It only submits GET forms, and you shouldn't use GET for actions like those anyway.
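That distinction is mechanical to check: GET is supposed to be safe and idempotent, so a crawler can filter forms by method before submitting anything. A sketch with the stdlib HTML parser (the example form actions are hypothetical):

```python
from html.parser import HTMLParser

class FormFinder(HTMLParser):
    """Collect (method, action) for every <form>. A crawler should
    only auto-submit method="get" forms, since GET must not mutate
    state -- editing a wiki or filing a ticket belongs behind POST."""
    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            a = dict(attrs)
            # HTML defaults the form method to GET when omitted.
            self.forms.append((a.get("method", "get").lower(),
                               a.get("action", "")))

html = """
<form action="/site-search" method="get"><input name="q"></form>
<form action="/wiki/edit" method="post"><textarea name="body"></textarea></form>
"""
finder = FormFinder()
finder.feed(html)
safe = [action for method, action in finder.forms if method == "get"]
print(safe)  # only the search form is safe to probe
```

A site that edits pages via GET forms has a bigger problem than Googlebot; any prefetching proxy would trip it too.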


I was sort of hoping they had a valid credit card number :-)







