In that case, Google Webmaster Tools is not actually reporting an error. That report shows you which URLs Google tried to crawl but couldn't (because they were blocked) so you can review it and ensure that you are not accidentally blocking URLs that you want to have indexed.
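For example (the path here is just a made-up illustration), a robots.txt rule like this would cause URLs you might actually want indexed to show up in that report:

    # Hypothetical robots.txt -- this Disallow line blocks every URL under
    # /products/, so Google reports those URLs as blocked instead of crawling them.
    User-agent: *
    Disallow: /products/

If you only meant to block a small admin area, a rule that broad is exactly the kind of thing the report is designed to help you catch.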
I agree that it's confusing in that the report is in the "crawl errors" section.
(I built Google Webmaster Tools, so this confusion is entirely my fault; but I don't work at Google anymore, so sadly I can't fix this.)
Really cool study, thanks for sharing. I don't think the research proves that social media wouldn't be a worthwhile signal. They really don't look into how bookmarking data could be applied to results; they just look at how bookmarked sites relate to search results and the web at large. The strongest conclusion is that Delicious users only bookmark a sliver of the web's content, and while the researchers consider this a weakness, I would consider it a strength.
The way I see it, search engines are entirely inclusive, while bookmarking sites are selectively inclusive. For certain ultra-specific queries, I'd much prefer the search engine approach -- I want as much breadth as possible. For other not-so-specific queries, I'd much prefer the result sites to be vouched for by people. Looking one step further, for probably a decent chunk of my search queries, I'm OK if the number of sites being searched over is only about 10,000,000 (my estimate for how many sites on Delicious have been bookmarked by more than 20 users).
What I find exciting is that the two can be effectively combined. Take all the normal results that a search engine would show, but give each one a boost based on the log of how many times it's been bookmarked. It really is that simple.
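As a rough sketch of what I mean (in Python; the scoring function and the +1 smoothing are just illustrative assumptions, not how any particular engine actually scores):

    import math

    def boosted_score(base_score, bookmark_count):
        # Multiply the engine's normal relevance score by the log of the
        # bookmark count; the +1 terms keep never-bookmarked pages at their
        # original score instead of zeroing them out.
        return base_score * (1 + math.log(1 + bookmark_count))

    # A page with base score 2.0 and 400 bookmarks:
    # boosted_score(2.0, 400) ~= 2.0 * (1 + 5.99) ~= 14.0

A heavily bookmarked page gets a meaningful but bounded lift, while the long tail of never-bookmarked pages stays exactly where the engine already ranked it.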
I've tried that [before], but with multiple parameters, I simply got zero results. I then enclosed the site:foo OR site:bar within parentheses, leaving the rest of the query outside the parentheses. Whichever site: parameter came first appeared to be matched, but only that one; subsequent site:bar, etc. parameters did not produce any matches in the results.
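To make that concrete, the two forms I had tried looked roughly like this (placeholder domains and query terms):

    site:example.com OR site:example.org coffee grinders
    (site:example.com OR site:example.org) coffee grinders

The first returned nothing at all for me, and the second appeared to match only example.com.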
I'll have a look at your links, in a minute -- probably should have done that first.
EDIT: Well, your second link sure works. I wonder what I was doing wrong. When I saw someone else's comments online to the effect that it didn't work, I assumed that was why my own attempt failed.
I'm in serious need of some coffee, at the moment. Thanks for correcting my mis-impression; I look forward to straightening it out in my mind and taking advantage of this feature.
To answer a few of the questions asked in the comments:
My interpretation of the combined algorithmic and manual efforts is this:
-One of Google's paid link algorithms (possibly a new one, or possibly an existing one that was recently tweaked) flagged some of the links or one of the link networks. This caused those links to no longer count towards PageRank credit (and possibly caused some of the initial ranking drops, such as the ones from position 1 to position 7).
-When Google was alerted to the issue, they took a closer look and on manual inspection found not only additional problematic links but also other spammy issues (if you follow the link in my story to the blog post by the guy who helped NYT with the investigation, you'll see that the SEO firm set up doorway pages and that the jcp pages themselves have keyword stuffing and hidden links on them). Based on that manual review, Google added a manual penalty to the site.
That's why my conclusion is that once they fix the issue, the manual penalty will be removed and they'll rise a bit in ranking position. But since the algorithmic penalty simply (I'm speculating) caused some of the paid links to be devalued, there would be no "lifting" of this penalty.
It is very disheartening that something so vital to business success (understanding how to operate online; building a web site with good site architecture; engaging with searchers; solving their problems) is so often equated with these types of tactics.
I posted more as a comment on the original story, but I have covered this issue in depth (from when Google initially proposed it, to when it was launched) here:
Of course, a better solution is some type of progressive enhancement that ensures both that search engines can crawl the URLs and that anyone using a device without JavaScript support can view all of the content and navigate the site.
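A rough sketch of what I mean (made-up URLs, not any particular site's markup):

    <!-- Plain hrefs are crawlable and work in browsers without JavaScript.
         When JavaScript is available, a script can intercept these clicks
         and load the content dynamically instead of doing a full page load. -->
    <nav>
      <a href="/products/">Products</a>
      <a href="/products/widgets/">Widgets</a>
      <a href="/about/">About</a>
    </nav>

The dynamic behavior becomes a layer on top of URLs that already work everywhere, rather than the only way to reach the content.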
Well, I'm not seeing that result now, likely because the results are now filled with actual news about hiybbprqag. But if Vanessa Fox Nude was ranking, then that likely means that some signal was associating hiybbprqag with Google, and that either they're using (at least in part) a really old index or the crawler they're using doesn't follow redirects very well.
vanessafoxnude.com has been redirecting to my current site for several years now, but back when the original site was active, much of the incoming anchor text was related to Google and search.