The Pace of Innovation
If you read any other blogs besides this one, you’ve probably heard that MovableType blogs are currently being hammered by a “new breed” of spambot. Webhosts are shutting down MT installations in self-defence. SixApart has offered soothing words that a fix is on the way.
As it turns out, the spammers in question had been visiting golem, upwards of a thousand times a day, until I … umh … made them go away. But to know that, you’d have to read my server logs. They had no noticeable effect on server load or on the amount of comment spam I’ve received (4 in the past two months).
Gleaning what I can from the experience of others, what strikes me is how little innovation a spambot writer needs to exercise to get his 15 minutes of fame. This “new breed” of spambot differs from the ones in common usage a year ago in only two respects:
- Taking a page from the crapflooders, it operates from behind multiple anonymous proxies. Thus it can deposit hundreds of spam comments in a short interval of time, while evading MT’s lame-ass, ineffective comment throttle¹.
- Upon finding a comment-script to post to (by looking for the POST method of the comment-entry form), it cycles through the BlogID parameter, thus depositing spam on all of the blogs hosted by that MT installation.
Other than that, apparently, it’s no more sophisticated than the easily-defeated spambots of yesteryear. You don’t even need to do something mildly sophisticated to render it ineffective.
So why the brouhaha?
Because an unprotected MT installation will buckle under the load of a crapflood (and, apparently, because MT3.x is even worse than MT2.x in this regard). So this spambot’s author’s decision to go for quantity over quality has made for a lot of unhappy people, right now.
Trackback Spam (Update)
Lest anyone think I am totally sanguine about beating these spammers, let me hasten to say that I am not. By changing only a few lines of code in their spambot, they could switch to sending Trackback spam instead of Comment spam. And neither I, nor anyone else, would have any good way of defending against them. Sure, trackback throttling (incorporated into MT 3.1) would stem the tide. But, unlike comments, which can be made difficult for machines to POST, trackbacks are supposed to be POSTed by machines.
My one thought on the matter was to demand that the URL of the trackback resolve to the same IP address as the one the trackback ping came from, or at least to the same /24. Theoretically, the blogging software that produced the trackback ping is running on the same host as the website in question. Trackback pings sent by spammers via anonymous proxies, however, would not match.
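For concreteness, here is a rough Python sketch of that test. It is not the plugin itself (which would be Perl), and the function names and the "exact" / "subnet" / "mismatch" / "dns-failure" labels are made up for illustration. It assumes IPv4 and a single A record per host.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def same_host_or_subnet(url_ip: str, ping_ip: str, prefix: int = 24) -> str:
    """Compare two IPv4 addresses: 'exact', 'subnet' (same /prefix), or 'mismatch'."""
    a = ipaddress.IPv4Address(url_ip)
    b = ipaddress.IPv4Address(ping_ip)
    if a == b:
        return "exact"
    # strict=False lets us build the enclosing network from a host address.
    network = ipaddress.ip_network(f"{a}/{prefix}", strict=False)
    return "subnet" if b in network else "mismatch"


def check_trackback(trackback_url: str, ping_ip: str) -> str:
    """Resolve the host in the trackback URL and compare it with the ping's source IP."""
    host = urlparse(trackback_url).hostname
    if host is None:
        return "dns-failure"
    try:
        url_ip = socket.gethostbyname(host)  # first A record only (an assumption)
    except OSError:
        return "dns-failure"
    return same_host_or_subnet(url_ip, ping_ip)


# Documentation-range addresses, purely illustrative:
print(same_host_or_subnet("192.0.2.10", "192.0.2.10"))    # exact
print(same_host_or_subnet("192.0.2.10", "192.0.2.200"))   # subnet
print(same_host_or_subnet("192.0.2.10", "198.51.100.7"))  # mismatch
```

A ping would be accepted on "exact" or "subnet" and refused otherwise.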
I was halfway through coding up a plugin for MT 3.1, when it occurred to me to wonder how my existing, legitimate, trackback pings would have fared under such a system. So I wrote a little script to comb through the trackback pings I’ve received, and retrospectively apply this test to them. The results were a bit disappointing:
Of 172 trackback pings,
- 65 matched IP addresses exactly
- 13 more were in the same /24
- 89 didn’t come even close to matching
- 5 DNS lookups failed
This exaggerates the problem somewhat. Some of these pings are over 2 years old. If the person changed webhosts in the interim, the IP addresses certainly wouldn’t match today, even if they did match when the trackback was originally POSTed. But still, it looks like this strategy would have blocked over half the legitimate trackbacks I’ve received.
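To make the “over half” explicit, the tally can be redone in a few lines of Python. This only reproduces the arithmetic from the counts quoted above; it is not the script I ran, and treating a failed DNS lookup as a rejection is an assumption about how such a plugin would behave.

```python
from collections import Counter

# Counts quoted above; the category labels are illustrative.
counts = Counter({"exact": 65, "subnet": 13, "mismatch": 89, "dns-failure": 5})

total = sum(counts.values())
accepted = counts["exact"] + counts["subnet"]
# Assumption: a ping whose URL cannot be resolved is rejected too.
rejected = total - accepted

print(f"{counts['mismatch']} of {total} mismatched ({counts['mismatch'] / total:.1%})")
# 89 of 172 mismatched (51.7%)
print(f"{rejected} of {total} rejected ({rejected / total:.1%})")
# 94 of 172 rejected (54.7%)
```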
Back to the drawing board…
Update 2: Endearing Moderation
Chad Everett has created a plugin that has similar functionality to my forced comment previews, without the need to hack the MT source-code. Actually, as far as I can understand, his plugin doesn’t actually force a preview so much as toss any non-previewed comments into a moderation queue. As I’ve explained, mine was primarily a way to enforce XHTML validation and only secondarily to deter comment spam. So I say, “Moderation, schmoderation! If it hasn’t been previewed/validated, you can’t post it.”
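The difference between the two policies can be sketched in a few lines of Python. This is purely schematic: neither my hack nor Chad’s plugin is implemented this way, and the HMAC “preview token”, the SECRET value and the function names are all hypothetical. The idea is that the preview page hands out a token tied to the exact comment text, and a submission without a valid token is either refused outright or shunted into moderation.

```python
import hashlib
import hmac

SECRET = b"per-blog secret (hypothetical)"


def preview_token(body: str) -> str:
    """Token issued by the preview page, bound to the exact comment text."""
    return hmac.new(SECRET, body.encode("utf-8"), hashlib.sha256).hexdigest()


def handle_submission(body, token, policy="reject"):
    """Return 'publish', 'reject' or 'moderate' for a submitted comment."""
    previewed = token is not None and hmac.compare_digest(
        token.encode("utf-8"), preview_token(body).encode("utf-8")
    )
    if previewed:
        return "publish"
    # policy="reject": no preview, no post.
    # policy="moderate": unpreviewed comments go to a moderation queue.
    return "reject" if policy == "reject" else "moderate"


print(handle_submission("Nice post", preview_token("Nice post")))  # publish
print(handle_submission("Buy pills", None))                        # reject
print(handle_submission("Buy pills", None, policy="moderate"))     # moderate
```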
I actually got another spam comment tonight, which was of the endearing, hand-crafted sort that you see once you’ve eliminated the mechanized ones. The author came in on the following Google search
http://www.google.ca/search?q=.edu+mt-comments&hl=en&lr=&c2coff=1&start=60&sa=N
deciphered, as best he could, the entry in question, and then left the comment
I have a blog with XHTML 1.1 and I have tried employing your Mathenable feature and I keep getting the same error code default #F3423 substring expected could I be applying it incorrectly as maybe my lib/mt/app is a different library version.
Really delightful! Alas, his URL no longer points to some schlocky bargain-finds-on-ebay site….
¹ If you want effective comment throttling, you need to install a plugin (a plugin!).
Re: The Pace of Innovation
The plugin solution only solves a small part of the problem… for MT3.X users. There is still a whole slew of MT2.X users out there, and these plugins are not backward compatible… but I think 6A has now learned that the problem is not version-specific.
The solution at the application level is going to have to involve a plethora of blog-specific changes. In other words, monoculture is our enemy, just as monoculture in the forest can wipe out an entire forest. Even among all MT users, we will need to find solutions that are not MT-community-wide because that is a weakness in our defense against the spambots. The solution will probably not be to just install a plugin and hope the problem goes away, because if the plugin is not forward and backward compatible, it’s only a bandaid and the spambots will eventually get around it.
My hope is that the folks at 6A realize that patching MT3.X is only a partial solution, and does nothing for those bloggers still running MT2.6XX. They may want to leave those people behind, but I don’t recommend that because it’s not smart business… the surest way to convince people to upgrade, and the best public relations act they could perform, would be to embrace those people still using the earlier versions – some of us do have legitimate reasons for still running the old software… in my case, it’s temporary, but the spambot problem is something I have to deal with today, especially if I want to keep my webhosts from shutting MT down on the server (and that’s my goal). We don’t want webhosts to start refusing MT installations, even those of the latest and greatest version, so we have to figure out solutions that will work for all MT users no matter what version they are using now or in the future.
My point, I guess, is that any fix that is application-wide will, likewise, be only a temporary fix until the spambots learn the new routine. What is needed on the application level is a fix that will somehow be unique to each blog, making the spambot people’s job a lot harder. For example, why does the comment script have to be exactly the same on every MT installation of a given version? Why does the base installation have to be the main blog (it isn’t on my setup)? And why not make the base installation of MT immune from googlebot via a robots.txt file (installed automagically, when MT is installed, at the base level of the MT installation) so that the “unique” comment script name in that installation is not particularly “advertised” to one and all?
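As a rough illustration of those two ideas, here is a Python sketch that picks a per-installation script name and writes a robots.txt rule hiding the MT directory from well-behaved crawlers. MT does nothing like this out of the box as far as I know; the /cgi-bin/mt/ path, the naming scheme and the helper names are all hypothetical. Note that robots.txt only deters crawlers that choose to honour it.

```python
import secrets
from urllib.robotparser import RobotFileParser


def unique_script_name(prefix: str = "comments", nbytes: int = 4) -> str:
    """Pick a comment-script name that differs from one installation to the next."""
    return f"{prefix}-{secrets.token_hex(nbytes)}.cgi"


def robots_txt(cgi_path: str) -> str:
    """robots.txt body asking all crawlers to stay out of the MT directory."""
    return f"User-agent: *\nDisallow: {cgi_path}\n"


script = unique_script_name()
rules = robots_txt("/cgi-bin/mt/")  # hypothetical install path

# Check the rules the way a compliant crawler would read them.
parser = RobotFileParser()
parser.parse(rules.splitlines())
print(parser.can_fetch("Googlebot", f"http://example.org/cgi-bin/mt/{script}"))    # False
print(parser.can_fetch("Googlebot", "http://example.org/archives/000480.html"))    # True
```

The blog’s own pages stay indexable; only the directory holding the scripts is excluded.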
Fixes on the server level, however, that cut across all application platforms are, imnsho, a better solution in the wider viewpoint, because they are not limited to specific applications or versions of those applications. They help us all, no matter what apps we are using. A good conversation on this is also ongoing over in the TextDrive forums. :)