November 4, 2005

Spider Spamming

As Sam repeatedly reminds us, HTTP GETs should be safe and idempotent: they should retrieve a resource, not change state on the server.

Usually, it’s “granny” who must be protected from GETting some unsafe resource. In the case of blogs, however, it’s not granny you need to worry about. It’s search engine spiders.

sv-crawl.looksmart.com - - [03/Nov/2005:14:37:47 -0600] "GET /cgi-bin/MT-2.5/sxp-comments.pl?comments_form&static=1&entry_id=75&author=affiliate%20software&email=ciali@mail.com&url=http://www.partnersmanager.com&text=great%20site&bakecookie=1&post=%20POST HTTP/1.1" 302 571 "-" "Mozilla/4.0 compatible ZyBorg/1.0 (wn-14.zyborg@looksmart.net; http://www.WISEnutbot.com)"
sv-crawl.looksmart.com - - [03/Nov/2005:14:37:47 -0600] "GET /cgi-bin/MT-3.0/sxp-comments.pl?comments_form&static=1&entry_id=75&author=affiliate%2520software&email=ciali@mail.com&url=http://www.partnersmanager.com&text=great%2520site&bakecookie=1&post=%2520POST HTTP/1.1" 200 6139 "-" "Mozilla/4.0 compatible ZyBorg/1.0 (wn-14.zyborg@looksmart.net; http://www.WISEnutbot.com)"
sv-crawl.looksmart.com - - [03/Nov/2005:22:10:26 -0600] "GET /cgi-bin/MT-2.5/sxp-comments.pl?comments_form&static=1&entry_id=75&author=levitra&email=ciali@mail.com&url=http://www.one-levitra.com&text=great%20site&bakecookie=1&post=%20POST HTTP/1.1" 302 552 "-" "Mozilla/4.0 compatible ZyBorg/1.0 (wn-14.zyborg@looksmart.net; http://www.WISEnutbot.com)"
sv-crawl.looksmart.com - - [03/Nov/2005:22:10:26 -0600] "GET /cgi-bin/MT-3.0/sxp-comments.pl?comments_form&static=1&entry_id=75&author=levitra&email=ciali@mail.com&url=http://www.one-levitra.com&text=great%2520site&bakecookie=1&post=%2520POST HTTP/1.1" 200 6084 "-" "Mozilla/4.0 compatible ZyBorg/1.0 (wn-14.zyborg@looksmart.net; http://www.WISEnutbot.com)"
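
To check your own logs for requests like these, here is a minimal sketch in Python. It assumes Apache-style access log lines like the ones above. The function name `is_comment_get` and the choice of `author`, `text`, and `post` as telltale parameters are assumptions drawn from these four entries, not part of any blogging package.

```python
import re
from urllib.parse import parse_qs, urlsplit

# The quoted request field of an access log line: "METHOD target HTTP/x.y"
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+"')

# Query parameters that suggest a comment submission.
# Assumption: chosen from the log entries above; adjust for your own comment script.
COMMENT_PARAMS = {"author", "text", "post"}


def is_comment_get(log_line: str) -> bool:
    """Return True if the line is a GET whose query string looks like a comment submission."""
    match = REQUEST_RE.search(log_line)
    if match is None or match.group("method") != "GET":
        return False
    query = urlsplit(match.group("target")).query
    # keep_blank_values so that bare or empty parameters are still counted
    params = parse_qs(query, keep_blank_values=True)
    return COMMENT_PARAMS.issubset(params)
```

Running every line of an access log through `is_comment_get` should surface the spider requests shown above, whether or not the server acted on them.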

Does your blogging system accept comments via HTTP GET?
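
If the answer is yes, the fix is to refuse anything but POST at the comment endpoint. The sketch below shows the idea as a tiny WSGI application in Python. It is an illustration only, not how Movable Type or any other package implements it: `comment_app`, `save_comment`, and the in-memory `COMMENTS` list are hypothetical names, and a real system would also validate input and persist to a database.

```python
from urllib.parse import parse_qs

# Hypothetical in-memory store standing in for the blog's comment database.
COMMENTS = []


def save_comment(fields):
    """Hypothetical persistence step: keep the first value of each form field."""
    COMMENTS.append({name: values[0] for name, values in fields.items()})


def comment_app(environ, start_response):
    """WSGI app that accepts comments via POST only.

    The query string is deliberately never read, so a spider that follows a
    link with comment fields in the URL cannot create a comment.
    """
    if environ["REQUEST_METHOD"] != "POST":
        start_response(
            "405 Method Not Allowed",
            [("Allow", "POST"), ("Content-Type", "text/plain")],
        )
        return [b"Comments must be submitted with POST.\n"]

    # Read the form fields from the request body only.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length).decode("utf-8")
    save_comment(parse_qs(body))

    # Redirect after a successful POST so a page reload does not resubmit.
    start_response("303 See Other", [("Location", "/")])
    return [b""]
```

Served this way, a GET like the ZyBorg requests above would receive a 405 response and store nothing.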

Update:

That’s sooo 2004. The actual spammers, I expect, have long since moved on. In this case, I think I can blame Phil’s new blogging software and its gratuitous autolinking of URLs.
Posted by distler at November 4, 2005 1:36 AM

4 Comments & 0 Trackbacks

Re: Phil Spamming

Bah. That sucks: if you don’t link something in my comments yourself, it should not be linked. I’ll get that fixed, um, sometime after twenty or thirty spiders have picked it up.

I’ll bet I’m serving up a bunch of totally broken code we’ve posted over the years, with curly quotes and numbers and punctuation converted to smilies, too. Time to look for anti-munging plugins.

Posted by: Phil Ringnalda on November 5, 2005 12:33 AM

Re: Phil Spamming

Okay, slightly quicker than that:

remove_filter('comment_text', 'make_clickable');

thrown into a random plugin file (actually, not-randomly, into my dofollow anti-nofollow plugin), and it’s fixed. Sorry for the spam, though I always like an excuse to look back at old posts like that.

Sheesh on a shingle.

That must have been from before I gave up on not being blacklisted by corporate proxies as a result of my language.

Posted by: Phil Ringnalda on November 5, 2005 12:51 AM

Re: Phil Spamming

“Sorry for the spam, though I always like an excuse to look back at old posts like that.”

Well, it wasn’t actual spam, because MT doesn’t accept comments via GET. Thanks for reminding me why that’s the case (and how frighteningly recently it wasn’t).

Posted by: Jacques Distler on November 5, 2005 8:19 AM

Re: Spider Spamming

I have seen a spider named Scooter in my logs.

Which search engine’s spider is it? I have not been able to figure it out.

Any help would be appreciated.

Posted by: Shayari on September 26, 2007 1:23 PM