## November 17, 2003

### Comment Spam II

No, I haven’t (yet) received any more since I took action.

But, as predicted, the spammers have become more diversified in their techniques, so it’s time to bring other webloggers up to date.

The spammers appear to be using two techniques currently:

1. Find the URL of a comment-entry script (e.g. mt-comments.cgi) on Google and post a comment directly to that script.
2. Find a weblog entry by following a link from Blogdex, Daypop, Technorati or wherever. Look for a comment-entry form on that page, and submit the form.

My previous article dealt with defeating the first technique. Since writing it, 40 spambots have gotten their URLs added to my ban-list. At first, they were coming at a rate of 3 or 4 per day, but that has dropped off as my (former) comment-entry script URLs have slowly disappeared from Google’s index.
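For readers who want something concrete, here is a minimal sketch of the "ban whoever still hits the retired script URL" idea. It is written in Python rather than the Perl of the actual Movable Type scripts, and it is not the code from the previous article. The retired path, the `BanList` class and the `check_request` function are hypothetical names, and keying the ban-list by client address is an assumption made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical: the old, Google-indexed script location that no longer serves comments.
RETIRED_PATHS = {"/cgi-bin/mt-comments.cgi"}


@dataclass
class BanList:
    """In-memory set of banned client addresses (a real site would persist this)."""
    banned: set = field(default_factory=set)

    def is_banned(self, ip: str) -> bool:
        return ip in self.banned

    def ban(self, ip: str) -> None:
        self.banned.add(ip)


def check_request(path: str, ip: str, bans: BanList) -> int:
    """Return an HTTP status code, banning any client that requests a retired path."""
    if bans.is_banned(ip):
        return 403
    if path in RETIRED_PATHS:
        # Assumption: only a bot working from a stale index entry still uses this URL.
        bans.ban(ip)
        return 403
    return 200
```

Once an address has tripped the retired path, every later request from it is refused, which matches the drop-off described above as the old URLs age out of the index.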

The second technique has proven a problem for others. But it hasn’t affected me. I have no idea whether spambots using it have attempted to access my comment form. Why? Because I don’t have a comment-entry form on my individual archive page. You need to follow a link to get to the comment-entry form.

While easy for humans, figuring out which link to follow to reach the comment form adds an extra layer of complexity for the spambots. And it makes them susceptible to “honeypot” forms (“To get your IP Address permanently banned from this site, enter a comment below…”), among other devious things.
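A honeypot of this kind might look roughly like the following Python sketch. The endpoint names `TRAP_ACTION` and `REAL_FORM_LINK`, and both functions, are hypothetical. The archive page carries a plain link for humans and a decoy form for bots, and any submission to the decoy gets the sender banned.

```python
from html import escape

# Hypothetical endpoint names; neither comes from this weblog's actual setup.
TRAP_ACTION = "/cgi-bin/honeypot.cgi"
REAL_FORM_LINK = "/cgi-bin/comment-form.cgi"


def render_entry_footer(entry_id: int) -> str:
    """Build an archive page's comment area: a link for humans, a decoy form for bots."""
    warning = "To get your IP address permanently banned from this site, enter a comment below."
    return (
        f'<p><a href="{escape(REAL_FORM_LINK)}?entry_id={entry_id}">Post a comment</a></p>\n'
        f'<form method="post" action="{escape(TRAP_ACTION)}">\n'
        f"  <p>{escape(warning)}</p>\n"
        f'  <input type="hidden" name="entry_id" value="{entry_id}">\n'
        '  <textarea name="text"></textarea>\n'
        '  <input type="submit" value="Post">\n'
        "</form>"
    )


def handle_post(action: str, ip: str, banned: set) -> str:
    """Decide the fate of a form submission; posting to the decoy bans the sender."""
    if ip in banned:
        return "rejected"
    if action == TRAP_ACTION:
        banned.add(ip)
        return "banned"
    return "accepted"
```

A human reads the warning and follows the link instead. A bot that submits the first form it finds on the page bans itself.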

I haven’t bothered setting up a honeypot yet. And there are several other tricky techniques I could yet deploy. But those are for a future post. Remember my motto:

Posted by distler at November 17, 2003 10:29 AM


### Re: Comment Spam II

Last month I was visited by a human spammer.

Posted by: Sam Ruby on November 21, 2003 6:06 AM

### Turing Test

And you’re sure it was a human, and not a 'bot?

There’s at least one 'bot making the rounds which leaves random messages of the form

> I completely agree!

> This is getting out of hand. You people need to get a life!

> I just discovered your blog. It’s really interesting. Keep up the great work.

and so forth. The text is randomly chosen, sometimes even appearing to have something to do with the topic at hand (this may be by design, or it may be coincidental). The payload is the “author’s” URL link.