Eric

Automated Spam…a Case Study


In all honesty, I planned this as a blog post on how ineffective automated spamming is. It’s unethical and annoying. It’s poorly targeted, rarely varies anchor text, and can’t get past a site that moderates comments (as all sites should). On top of that, Google recommends against it. In light of all this, it is amazing that anybody would waste the time to set up a program to generate comment spam that provides zero benefit.

To prove that auto-spam is a complete waste of effort, I drew a random sample of 50 obviously automated spam comments that my personal blog has received over the past 6 months. I compared the sites being represented to the anchor text that the spam posts were trying to push, and I then analyzed the rankings. The results were surprising, if not outright shocking. Some sites create and push this generic (and often offensive) spam…because it works.

Results (note: given the nature of many sites that spam, and the internet in general, some of the targeted anchor text is a bit risque, so please don’t click on the jump if that is something that you would prefer not to read):

There are a couple of notes that I would like to make before going into the data:

  • I’ve kept the domain names hidden to protect the guilty (and ensure that they do not get any props by being mentioned in a credible blog).
  • A few (and only a few) of these sites do have domain names that could help their rankings for these terms.
  • None of these domains has a PR greater than 1 and each has a link profile that is indicative of spam.  Most of the domains have been registered after the beginning of 2009.
  • None of the landing pages linked to have good content to represent the targeted phrases.
  • Link profiles for each domain are primarily comment spam similar to the items my site received.  Without good content or (apparent) quality/relevant inbound links – it does appear that the primary factor for these rankings is the targeted anchor text in the spam comments.
  • Rankings were obtained by using the RankChecker add-on for Firefox.  RankChecker checks the top 200 results for each search engine.  For calculation purposes (averages and deviations), I will be counting each non-result as position #201.
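For anyone who wants to repeat the exercise, here is a minimal sketch of the "count each non-result as #201" convention described in the notes above. The rank values are hypothetical placeholders, not my actual data, and the helper name `impute_ranks` is made up for illustration.

```python
from statistics import mean, stdev

# One position past the top-200 window that RankChecker reports.
NOT_RANKED = 201

def impute_ranks(ranks):
    """Replace missing ranks (None) with the not-ranked placeholder."""
    return [NOT_RANKED if r is None else r for r in ranks]

# Hypothetical ranks for five phrases; None means "not found in the top 200".
sample = [3, 47, None, 12, None]
filled = impute_ranks(sample)

print(filled)          # the list with 201 substituted for each None
print(mean(filled))    # average rank across all five phrases
print(stdev(filled))   # sample standard deviation across all five phrases
```

Keep in mind that 201 is a floor, not a measurement: a site that really sits at #5,000 is still counted as #201, so averages computed this way flatter the non-ranking sites.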


Data:

.

Results:

  • Out of the random sample of 50, Google ranked 23 sites with 8 first page rankings.  Bing ranked 20 sites with 4 first page rankings.
  • 13 phrases ranked in both search engines.
  • The 23 ranking sites in Google had an average rank of 20 with a standard deviation of 19.  The 20 ranking sites in Bing had an average rank of 40 with a standard deviation of 41.
  • Using non-rankings as value 201, the average Google rank for all 50 sites was 118 with a standard deviation of 92.  The average Bing rank was 171 with a standard deviation of 83.
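The all-50 figures can be rebuilt from the ranked-only figures plus the #201 placeholder, without the raw data. The sketch below does this for the Google numbers. It assumes the standard deviations quoted here are sample standard deviations (dividing by n - 1), and the function name `combined_stats` is my own invention. Because the inputs are already rounded, the output is approximate.

```python
from math import sqrt

NOT_RANKED = 201  # placeholder position for sites outside the top 200

def combined_stats(n_ranked, mean_ranked, sd_ranked, n_total):
    """Mean and sample SD of all sites, treating unranked sites as #201.

    Assumes sd_ranked is a sample standard deviation (n - 1 denominator).
    """
    n_missing = n_total - n_ranked
    overall_mean = (n_ranked * mean_ranked + n_missing * NOT_RANKED) / n_total

    # Squared deviations inside the ranked group, recovered from its SD.
    ss_within = (n_ranked - 1) * sd_ranked ** 2
    # Squared deviations of each group's center from the overall mean.
    ss_between = (n_ranked * (mean_ranked - overall_mean) ** 2
                  + n_missing * (NOT_RANKED - overall_mean) ** 2)

    overall_sd = sqrt((ss_within + ss_between) / (n_total - 1))
    return overall_mean, overall_sd

# Google: 23 of 50 sites ranked, average rank 20, standard deviation 19.
google_mean, google_sd = combined_stats(23, 20, 19, 50)
print(round(google_mean), round(google_sd))  # 118 92
```

Most of the spread comes from the gap between the ranked group and the block of 201s, which is why the all-sites standard deviation is so much larger than the ranked-only one.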


Conclusions (based upon this set of data):

  • To varying degrees, the automated spam “worked” with both Google and Bing.  Google ranked 46% of the sample in its top 200; Bing ranked 40%.
  • Bing was more successful than Google at giving less credit to the spam (yes, Matt Cutts, I said that).
  • While a few of these phrases are longer tail with low search volume, the vast majority are high volume short tail phrases.  A site that ranks #10 for “xxx” with its 7,480,000 estimated monthly local searches (according to Google) is sure to produce quite a bit of traffic.


These results are very disappointing to me.  As I mentioned at the top, I expected this to be a post on the lack of effectiveness of comment spam.  This isn’t my fault – I moderate my site and spammy links don’t get through.  If you moderate your site, then it’s not your fault either.  It’s the fault of those who do not moderate and let these links steal their PR.  It is also the fault of the search engines that give credit when and where this occurs.  Now, it’s unreasonable to expect the search engines to catch all of these instances, but 40-46% of a random sample getting through just is not acceptable.  Why should we expect automated spam to leave our sites alone when it is being rewarded for its efforts?

Big Idea:

Please note that I am not in any way, shape, or form advocating using automated spam.  I do not engage in that and will continue not to.  While it did show some results for these sites, I still believe it is a poor long-term strategy.  Since they already represent a morally dubious industry (in most cases), these sites listed – actually not listed, but you catch my drift – have no fear of incurring a penalty, closing up shop, and moving to a new domain to start all over again.  Can your brand afford to do that?  Can you risk alienating potential customers and appearing amateurish with mass quantities of random comment spam?  I doubt it.  If you care about the long-term welfare of your website and the reputation of your brand, then there is still a right way to do things.

Major Disclaimers:

  1. There may be other on-site factors that lead to a few of these sites ranking well for the targeted phrases.  Given the nature of most of these sites, my willingness to go through them at work is far below my desire to keep my job.
  2. Just because they have not been penalized by Google doesn’t mean that the penalty isn’t coming.  As noted, all of these links appeared in the last 6 months, so the efforts may still be too recent for the search engines to have fully caught on.  We have all seen sites where the penalty is delayed but basically destroys the domain once the hammer comes down.  Also of note, many of these sites use domain names that are almost gibberish.  My best guess is that these sites expect to be penalized sooner rather than later, but will simply close up shop, move elsewhere, and restart the cycle when they are finally caught.
  3. Due to a lack of time, I did not go through every link to these sites – so it is possible that they do have some quality link juice that is leading to these rankings.  Not likely based upon the samples that I did investigate, but possible.

Your thoughts?

- EW

Twitter: @ejwestksu