Clever Referer Spam

Update (2/26/06): Someone associated with the ‘nipple huggers’ site has written to complain about my accusations here. She has also left a couple of comments below. Just to be clear, there is no evidence that the site sends email spam, uses obtrusive popups, or installs spyware/adware, etc., on your computer. It simply appears that someone has tried to improve the site's position in search results by generating HTTP requests to other popular sites with its domain name in the referer field.


I used to have a big problem with “referer spam.” What is referer spam? My weblog lists “inbound links” in the right column so visitors can see who else has linked here. Since many weblogs provide a similar list, spammers began to create “spurious” inbound links so their URLs would appear in the right column of many weblogs, thus boosting their Google PageRank. Usually, if you went back to the site that ostensibly linked to my weblog, it would be a porn or gambling site with no true links to my weblog.

This was easy enough to fix: I wrote a handmade filter that regularly checks all the putative inbound links and verifies that they do, in fact, link to my site.
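The filter itself isn't reproduced here, but a minimal sketch of that kind of check might look like the following Python. The function names, the timeout, and the choice to treat unreachable pages as spam are all assumptions made for illustration, not the actual script. It fetches the referring page, collects every link on it, and accepts the referer only if at least one link resolves to my host.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class _LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def page_links_to(html, page_url, my_host):
    """Return True if any <a href> in html resolves to my_host."""
    collector = _LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        try:
            # Resolve relative links against the referring page's URL.
            host = urlparse(urljoin(page_url, href)).hostname
        except ValueError:
            continue  # malformed href: ignore it
        if host and host.lower() == my_host.lower():
            return True
    return False


def referer_is_genuine(referer_url, my_host, timeout=10):
    """Fetch the referring page and check that it really links back."""
    try:
        with urlopen(referer_url, timeout=timeout) as resp:
            html = resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return False  # assumption: unreachable or malformed means spam
    return page_links_to(html, referer_url, my_host)
```

Run periodically over the list of putative inbound links, something like `referer_is_genuine(url, "blog.example.org")` (a placeholder host) would drop any referer whose page has no link back.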

Just today, I found my first instance of a spammer adaptation: the inbound link came from a site selling “nipple huggers,” some sort of jewelry that I don’t quite understand. I was curious how the site escaped my “referer check” script, so I checked it out. It turns out the “nipple hugger” site does link to my blog, with the link text “PopUp Scam – Click X to Close.” The linked page on my site has nothing to do with popup scams, but it is an interesting workaround to my filter. Instead of the spammer generating fake/spurious links, real visitors to the “nipple hugger” site apparently click on the link to my blog and generate “real” referers. Today alone, I received inbound links from the “nipple hugger” page from ten different hosts.

I can’t think of any clever way to automatically filter these sorts of inbound links, because they really don’t look any different from genuine inbound links. For now, I’m just adding a keyword filter for known bad referers (so far, only the “nipple hugger” site). Suggestions for more clever ways to escalate this arms race are welcome.
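For the curious, a keyword filter of this sort can be very small. The sketch below is one possible version in Python, not the exact code running here: the blocklist contents, the function names, and the example URLs are hypothetical. It strips punctuation and case before matching, on the assumption that a spammer's domain might spell the phrase with hyphens or run it together.

```python
import re

# Hypothetical blocklist: only one bad referer is known so far.
BLOCKED_KEYWORDS = ["nipple hugger"]


def _squash(text):
    """Lowercase and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def is_blocked_referer(referer, keywords=BLOCKED_KEYWORDS):
    """Return True if the referer URL contains any blocked keyword,
    ignoring case, spaces, hyphens, and other punctuation."""
    squashed = _squash(referer)
    for keyword in keywords:
        needle = _squash(keyword)
        if needle and needle in squashed:
            return True
    return False
```

With this, a referer such as `http://www.nipple-huggers.example/links.html` (a made-up address) would be rejected, while ordinary referers pass through. The obvious weakness is that every new spammer needs a new keyword, which is why this is a stopgap and not a real fix.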

(I really hope my site doesn’t become a top search result for “nipple hugger” now. If it does, please, look elsewhere, I don’t even know what they are!)