Any increase in spam, whether in e-mail or in Web searches, is annoying, and for years Google has diligently worked its algorithmic magic to minimize it in both areas. But even the best algorithms may not be as sharp as the human eye, particularly when it's your own eye spotting junk links in your Web search results. That's why Google may consider letting each of us become our own spam fighter, adding to the effort the search giant already puts in.
"Today, English-language spam in Google’s results is less than half what it was five years ago, and spam in most other languages is even lower than in English. However, we have seen a slight uptick of spam in recent months, and while we’ve already made progress, we have new efforts underway to continue to improve our search quality," wrote Google principal engineer, Matt Cutts, on Google's blog late last week.
Those new efforts might include you.
SearchEngineLand.com noted that Cutts posted remarks on Hacker News to that effect, in response to this question: "Can you speak about the possibility for personal domain blacklists for Google accounts? I know giving users the option to remove sites from their own search results is talked about a lot in these HN threads. Is there any talk internally about implementing something like this?"
Responded Cutts: "We've definitely discussed this. Our policy in search quality is not to pre-announce things before they launch. If we offer an experiment along those lines, I'll be among the first to show up here and let people know about it. :)"
As Cutts explained on Google's blog:
Webspam is junk you see in search results when websites try to cheat their way into higher positions in search results or otherwise violate search engine quality guidelines. A decade ago, the spam situation was so bad that search engines would regularly return off-topic webspam for many different searches. For the most part, Google has successfully beaten back that type of 'pure webspam' — even while some spammers resort to sneakier or even illegal tactics such as hacking websites.

As we’ve increased both our size and freshness in recent months, we’ve naturally indexed a lot of good content and some spam as well. To respond to that challenge, we recently launched a redesigned document-level classifier that makes it harder for spammy on-page content to rank highly. The new classifier is better at detecting spam on individual web pages, e.g., repeated spammy words — the sort of phrases you tend to see in junky, automated, self-promoting blog comments.

We’ve also radically improved our ability to detect hacked sites, which were a major source of spam in 2010. And we’re evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others’ content and sites with low levels of original content.

We’ll continue to explore ways to reduce spam, including new ways for users to give more explicit feedback about spammy and low-quality sites.
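To make the "repeated spammy words" idea concrete, here is a minimal, purely illustrative sketch of a document-level spam signal in the spirit of what Cutts describes. Google's actual classifier is not public; the word list, scoring, and threshold below are all invented for the example.

```python
from collections import Counter

# Illustrative only: a toy "document-level" spam signal that flags text
# which leans heavily on repeated spammy phrases, the sort seen in junky,
# automated blog comments. The term list and threshold are assumptions.
SPAMMY_TERMS = {"cheap", "free", "casino", "pills", "viagra"}

def spam_score(text: str) -> float:
    """Fraction of words in the text that are known spammy terms."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0
    counts = Counter(words)
    spam_hits = sum(counts[t] for t in SPAMMY_TERMS if t in counts)
    return spam_hits / len(words)

def looks_spammy(text: str, threshold: float = 0.2) -> bool:
    """Flag text whose spammy-word density crosses the (invented) threshold."""
    return spam_score(text) >= threshold

print(looks_spammy("Buy cheap pills! Cheap pills free! Cheap casino pills!"))  # True
print(looks_spammy("A thoughtful post about search quality."))                 # False
```

The point of the toy version is only that repetition itself is a signal: the junk comment scores high not because any one word is damning, but because the same spammy terms dominate the text.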
And on the subject of users doing their own spam-killing, SearchEngineLand.com observed: "Google’s SearchWiki feature previously allowed users to do something similar, but SearchWiki edits were done at the page and keyword level; you could remove individual pages from the search results for certain keywords. Even though Google shut down SearchWiki last March, any results you removed while it was active are still preserved to this day in your Google account."
Cutts' comment on Hacker News sounds "more comprehensive," the site added, and "suggests that users could make domain blacklists that apply across-the-board to any keyword."
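The distinction SearchEngineLand draws can be sketched in a few lines: SearchWiki removed individual (keyword, page) pairs, whereas a domain blacklist would drop every result from a blocked domain for any query. Nothing here reflects how Google actually implemented or would implement such a feature; the domains and helper function are invented for illustration.

```python
from urllib.parse import urlparse

# Hypothetical user-level domain blacklist. Unlike a per-keyword,
# per-page removal, this filter applies to every query's results.
blacklist = {"spam-farm.example", "junk-links.example"}

def filter_results(results: list[str], blocked: set[str]) -> list[str]:
    """Drop any result URL whose hostname is on the user's blacklist."""
    return [url for url in results
            if urlparse(url).hostname not in blocked]

results = [
    "https://example.org/good-article",
    "https://spam-farm.example/keyword-stuffed-page",
    "https://junk-links.example/another?q=anything",
]
print(filter_results(results, blacklist))
# ['https://example.org/good-article']
```

Filtering by hostname rather than by full URL is what makes the blacklist "across-the-board": one entry suppresses every page a spammy domain ever ranks for.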