Shareware Beach

Friday, 21 January 2005

Comment Spammers Won’t Follow “No Follow”

Filed under: Cyberspace — Jan @ 20:34

Since starting this blog, email isn’t the only way I get spammed. Every day, comments are posted to my blog articles that aren’t genuine feedback, but ads for various kinds of dubious web sites. This is known as “comment spam”.

Now the major search engines Google, MSN and Yahoo are teaming up with blogging system developers such as Six Apart (TypePad, Movable Type and LiveJournal) and WordPress (which I use for Shareware Beach) to come up with a solution to eliminate comment spam. The idea is that blogging software will add a “nofollow” attribute to each link in comments or trackbacks. Search engines will then ignore such links.
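In practice the "nofollow" marker is written as a value of the link's rel attribute, i.e. rel="nofollow". As a rough illustration of what blogging software would do to each comment before publishing it, here is a minimal Python sketch. The function name add_nofollow and the regex-based approach are my own assumptions for illustration, not how WordPress or Movable Type actually implement it; a real implementation would more likely use a proper HTML parser.

```python
import re

# Opening tag of an anchor element, e.g. <a href="...">.
# \b keeps this from matching other tags such as <abbr>.
ANCHOR_OPEN = re.compile(r'<a\b([^>]*)>', re.IGNORECASE)

# An existing rel attribute (quoted or unquoted) inside the tag.
REL_ATTR = re.compile(r'''\s+rel\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)''', re.IGNORECASE)


def add_nofollow(comment_html: str) -> str:
    """Return comment_html with rel="nofollow" forced onto every <a> tag.

    Hypothetical helper: any rel value the commenter supplied is dropped,
    so a spammer cannot opt out by writing their own rel attribute.
    """
    def mark(match):
        attrs = REL_ATTR.sub("", match.group(1))
        return f'<a{attrs} rel="nofollow">'

    return ANCHOR_OPEN.sub(mark, comment_html)


if __name__ == "__main__":
    print(add_nofollow('Great post! <a href="http://example.com/">my site</a>'))
```

The human visitor sees exactly the same link either way; only a search engine that honours the attribute treats it differently.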

They reason that comment spam exists because many search engines count inbound links to a web page to determine the relevancy of that page. The more relevant a page, the higher it appears in search results. The spammers are trying to cheat the search engines into giving their web pages higher rankings by getting more inbound links. Google’s inbound link algorithm is called “PageRank” after Google co-founder Larry Page. Many people are obsessed with PageRank, even though it’s only one of many variables Google uses.
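To see why inbound links are worth spamming for, here is a small Python sketch of the simplified, textbook form of the PageRank calculation. It is not Google's production algorithm, which uses many more variables. The damping factor of 0.85 is the commonly cited value, and the three-page link graph is made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to.
    Returns a dict mapping every page to a score; scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical graph: a news site links to a blog, and a comment
    # on that blog links to a spammer's page.
    graph = {"news": ["blog"], "blog": ["spam"], "spam": []}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 3))
```

In this toy graph the spammer's page ends up with a higher score than the news site, purely because of the one link planted in the blog's comments. If a search engine ignores that link, the spam page loses this boost.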

Many bloggers have hailed this new initiative. I doubt they have given it much thought, though, or that they even understand the reasons and implications. I’m sure it will fail utterly, for several reasons.

1. Spamming takes little effort. The spammers have developed their comment spam bots already. The hard work has been done. Keeping the bots running costs next to nothing. Even if the spam no longer helps them with search engines, it won’t hurt them to keep going.

2. Even if search engines ignore comment spam, there are still scores of humans reading all those comments. If humans didn’t read comments, bloggers wouldn’t publish them. And humans reading comments will click on their links, if the spammer words the comment cleverly enough. (Some comment spam is hard to distinguish from a real comment, even for a human. The only clue is that the comment is generic, such as “great post”, rather than elaborating on the topic at hand.) So comment spam will continue to bring traffic for the spammers, just like email spam does.

3. One of the reasons PageRank and similar algorithms work well is that webmasters have little influence over inbound links, and certainly not over quality inbound links. (There’s more to PageRank than just the number of links: which pages the links come from matters more.) A “nofollow” attribute destroys that, since it allows each webmaster to determine which links will “earn” PageRank, and which won’t. A webmaster could easily add the attribute to all outbound links that don’t point to his own sites or his friends’ sites.

4. When somebody posts a comment to my blog that is insightful enough for me to allow it (comments are moderated), I want them to link to their own blog, and I want search engines to follow that link. It’s only fair that if somebody makes a contribution, they are credited for it, however small the credit may be. Presumably, WordPress and other blog systems will enable bloggers to turn off the “nofollow” feature. That is exactly what I will do. Spammers will continue spamming, knowing that at least some of their links will be followed by the search engines.

Or, as The Register puts it: “Google’s No-Google tag blesses the Balkanized web”.

AskJeeves and Teoma don’t plan to follow “nofollow” just yet. It’s good to have some diversity among search engines.
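To make point 3 above concrete, here is a rough Python sketch of how little it would take for a webmaster to apply "nofollow" selectively. Everything in it is hypothetical: the host names, the function name selective_nofollow, and the assumption that links are written with a double-quoted href. It only shows that the decision about which links "earn" PageRank would move into the hands of the linking site.

```python
import re
from urllib.parse import urlparse

# An <a> tag with a double-quoted href; group 2 captures the URL.
ANCHOR = re.compile(r'<a\b([^>]*?)href="([^"]*)"([^>]*)>', re.IGNORECASE)

# Hypothetical list of sites the webmaster wants to keep "voting" for.
FAVOURED_HOSTS = {"my-own-site.example", "my-friend.example"}


def selective_nofollow(page_html, favoured=FAVOURED_HOSTS):
    """Add rel="nofollow" to every link except those to favoured hosts."""
    def mark(match):
        host = urlparse(match.group(2)).hostname
        # Relative links (no host) stay on the webmaster's own site,
        # so they are left untouched along with the favoured hosts.
        if host is None or host in favoured:
            return match.group(0)
        return match.group(0)[:-1] + ' rel="nofollow">'

    return ANCHOR.sub(mark, page_html)


if __name__ == "__main__":
    page = ('<a href="http://my-friend.example/">a friend</a> and '
            '<a href="http://rival.example/">a competitor</a>')
    print(selective_nofollow(page))
```

A dozen lines like these, run over a whole site, would quietly withhold link credit from everyone but the webmaster's own circle.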
