Spam Counterpunch Strategy

Over the last week or two, I’ve been honing my counterpunch for my blogspambombers, doing some research on other avenues to stem the flood, and figuring out how to combine the best of these efforts. There are some very nice solutions out there, including displaying a numeric image to would-be commenters that they can enter to validate that there are human eyes on the other side of the keyboard. Since most of my readers aren’t hardcore bloggers, and since I like making the user experience simple, I didn’t want to complicate the process of leaving comments on the site.

I know I’m probably tipping my hand on how I’m trying to block this junk, but it just seems appropriate to share this knowledge, both here and in the WordPress support fora. Some of this is either directly lifted from, or inspired by, the WP fora, or other sites. I’ll give credit where I can, and if I mess up the credit, just let me know.

  • From the WP support fora:

    RewriteEngine On
    RewriteCond %{HTTP_REFERER} "!^http://mysite.com/.*$" [NC]
    RewriteCond %{REQUEST_URI} ".*wp-comments-post.php$"
    RewriteRule .* - [F]

    This one seems like it would help some. Basically, this addition to httpd.conf (or .htaccess) will return a 403 if a request to wp-comments-post.php isn’t referred from some page from mysite.com — your comment script name and domain are probably different, so just make that change. It seems like a blogspambomber could easily change the HTTP_REFERER in their request, so that might not fix things for long, although their script would have to be pretty nimble to put the right thing in the HTTP_REFERER field. Also, you run the risk of boxing out real user clients that don’t send the HTTP_REFERER information.
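
    If you want to see which referers that condition lets through before trusting it on a live site, the same regular expression can be tried out with grep. This is only a rough sketch: `referer_allowed` is a name I made up, `mysite.com` is the same placeholder domain as in the rule, and I've escaped the dots here, which the rule above doesn't do.

    ```shell
    # Mirror of the RewriteCond referer test: exit 0 when the referer would be
    # let through, non-zero when the rule would answer with a 403.
    # -i stands in for the [NC] (no case) flag.
    referer_allowed() {
        printf '%s\n' "$1" | grep -Eiq '^http://mysite\.com/.*$'
    }

    # Try a good referer, a foreign one, a "www." variant, and an empty one.
    for ref in "http://mysite.com/2005/01/some-post/" \
               "http://spam.example/" \
               "http://www.mysite.com/some-post/" \
               ""; do
        if referer_allowed "$ref"; then
            echo "pass: '$ref'"
        else
            echo "403:  '$ref'"
        fi
    done
    ```

    The empty referer and the "www." variant both land in the 403 pile, which is exactly the "boxing out real users" risk.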

    I think this change is probably a nice touch, but I would rather ban the IP address from hitting anything on the site if they are trying to hit the commenting scripts directly. My feeling is that if they’re doing something creepy by trying to hit my scripts, then there’s no telling what might be tried next! I didn’t implement this; read on…

  • Also from the same thread on the WP support fora:
    There was a conversation point that if you changed the name of wp-comments-post.php to something else without leaving something named wp-comments-post.php behind to “answer the phone” when the blogspambombers hit, they would realize that the comment posting script name had changed, and they could simply parse through a comment form to get the name of the new comment posting script. In fact, that could even happen automagically with some not-too-difficult scripting. So, when I changed the file name to something else, I made sure that something was left behind for the blogspambombers to hit. What did I leave behind? Read on…
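
    To show how little scripting "not-too-difficult" really means, here is a sketch of pulling the posting script name out of a comment form. The function name `form_action` and the sample form line are mine, made up for illustration; a real bot would fetch the page first.

    ```shell
    # Print the target of the first form found in the HTML on stdin.
    form_action() {
        grep -o 'action="[^"]*"' | head -n 1 | cut -d'"' -f2
    }

    # Hypothetical comment form pointing at a renamed posting script.
    printf '%s\n' '<form action="/wp-dummy-comments-12345-post.php" method="post">' | form_action
    ```

    One line of grep is all it takes, which is why a rename on its own doesn't buy much.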
  • From a site somewhere:
    I read about a tripwire system to trip up badly behaving bots. I implemented this, and it works great to trap robots that don’t obey robots.txt, banning them after they stray from the happy path. I thought that would be a great addition to my WordPress blogspambomber arsenal. Here’s the code that lives in my wp-comments-post.php, mostly commented for your enjoyment:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
    <head>
    <title>Banned for life!</title>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    <style type="text/css">
    <!--
    body {
    font-family: Verdana, Arial, Helvetica, sans-serif;
    font-size: medium;
    color: #CCCCCC;
    background-color: #000000;
    }

    a {color: #CCCCCC;}
    a:hover {color: #FFFFFF;}
    a:active {color: #FFFFFF;}
    a:visited {color: #CCCCCC;}
    a:link {color: #CCCCCC;}
    -->
    </style>
    </head>

    <body>
    <p>You have triggered a trip-wire. This script exists solely to catch people doing
    things they shouldn't be, such as looking for administrative scripts like you were.</p>
    <p>As a result, your IP network address (

    <?php $remote_addr = getenv("REMOTE_ADDR"); $remote_host = getenv("REMOTE_HOST"); $remote_agent = getenv("HTTP_USER_AGENT"); echo $remote_addr; ?>

    ) has been blocked from this
    entire site. You will no longer be able to browse domains here. In addition, the webmaster
    has been alerted to this activity and will be reviewing the records for possible
    action with your service provider.</p>
    <p>If you have stumbled here by accident, you can
    <a href="mailto:webmaster@mysite.com?subject=IP Ban for <?php echo $remote_addr ?>">send the webmaster an email</a>.
    Click the link and explain why you're reading this screen. Be sure to paste
    in the network address in parentheses above so that the webmaster can unblock you. If you
    don't e-mail the webmaster, you will <strong>NOT</strong> be able to get back to this screen
    again - you are <strong>BANNED</strong>.</p>

    <?php

    # send an e-mail telling me about the banning

    $to = "banned@mysite.com";

    $subject = "[Alert] A ban has been triggered! (wp-comments-post.php)";

    $dateout = date ( 'Y-M-d @ G:i' );

    $message = "An address has been blocked from accessing mysite.com because it called wp-comments-post.php.\r\n";
    $message .= "\tDate\t\t$dateout\r\n";
    $message .= "\tIP address\t$remote_addr\r\n";
    $message .= "\tHostname\t$remote_host\r\n";
    $message .= "\tAgent\t\t$remote_agent\r\n";

    $headers = "From: ip_ban@mysite.com\r\n";
    $headers .= "Reply-To: ip_ban@mysite.com\r\n";

    mail($to, $subject, $message, $headers);

    ?>

    <?php

    # add this ip to htaccess

    $htaccess = "/www/apache/htdocs/.htaccess";

    $htaccess_lines[1] = "\n#----- LOOKING FOR WP-COMMENTS-POST.PHP -----\n";
    $htaccess_lines[2] = "# $dateout $remote_agent\n";
    $htaccess_lines[3] = "SetEnvIf Remote_Addr ^$remote_addr$ denied \n";
    $htaccess_lines[4] = "#----- LOOKING FOR WP-COMMENTS-POST.PHP -----\n";

    # get current htaccess lines into array
    $htaccess_current = file($htaccess);

    # merge the arrays (prepending new lines)
    $htaccess_array = array_merge($htaccess_lines, $htaccess_current);

    # shove array into string
    $htaccess_output = "";
    $num_lines = count($htaccess_array);
    for ($i = 0; $i < $num_lines; $i++) {
        $htaccess_output .= $htaccess_array[$i];
    }

    # open, lock, write, unlock, close
    $fp = fopen($htaccess, 'w+');
    flock($fp, LOCK_EX);
    fwrite($fp, $htaccess_output);
    flock($fp, LOCK_UN);
    fclose($fp);

    ?>

    If someone hits this script, they were looking for something they shouldn’t have been, so there’s no mercy. When the script is triggered, a nice informative message is sent back to the caller letting them know they hit the tripwire, and that they are banned. If it’s a bot driving the bus, then they won’t pay attention; if it’s a human, they’ll know they were being bad. Either way, their IP address is added with a deny to .htaccess and they are thrown 403’s from then on. It’s entertaining to watch the bot dip its toe in the water with a GET of some page, try a POST to wp-comments-post.php, and then get 403’s from then on. An e-mail is also sent to me to tell me of the ban. As I said before, I enjoy entertainment.
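
    For anyone who would rather do the .htaccess part from a shell script, here is a rough sketch of the same prepend step. It is not what runs on my site: `ban_ip` is a made-up name, there is no file locking, and it assumes your .htaccess already has something like `Deny from env=denied` to act on the `denied` marker, which isn't shown in the PHP above. It also escapes the dots in the address, which the PHP doesn't bother with.

    ```shell
    # Prepend a "denied" marker for one IPv4 address to an .htaccess-style file.
    # Usage: ban_ip 203.0.113.7 /path/to/.htaccess
    ban_ip() {
        ip=$1
        file=$2
        # Escape the dots so "1.2.3.4" is matched literally by SetEnvIf.
        pattern=$(printf '%s' "$ip" | sed 's/\./\\./g')
        tmp=$(mktemp) || return 1
        {
            printf '# banned %s\n' "$(date '+%Y-%m-%d %H:%M')"
            printf 'SetEnvIf Remote_Addr ^%s$ denied\n' "$pattern"
            cat "$file"
        } > "$tmp" && cat "$tmp" > "$file"   # rewrite in place, keeping permissions
        rm -f "$tmp"
    }

    # Demo against a scratch file rather than a live .htaccess.
    demo=$(mktemp)
    printf 'Deny from env=denied\n' > "$demo"
    ban_ip 203.0.113.7 "$demo"
    cat "$demo"
    rm -f "$demo"
    ```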

    How well does this work? Well, as I mentioned a few days ago, I had what appeared to be a zombie-net hitting my site. Had it been successful, or had I had to moderate all the comments, I could’ve been in for a lot of work. Aside from seeing 108 e-mails telling me of IP bans, the work on my side to recover was zero. And, the logs were fun to examine. Again, bizarre entertainment at the cost of zombie-bot-blogspambombers.

    As a side note, it was just cool to see the zombie-net move from IP to IP every few minutes when it figured out that I’d automagically banned the current one. What was neater was seeing my system respond well to that, blowing away the IPs, one by one.

  • Lastly, here’s the coup de grâce for turning the sun-seared magnifying glass upon the ants on the spamhill. I was reading about having timeslots for comment posting on Internet Alchemy, which had some of the concepts of an idea I had lobbed at Beck just a few nights ago. I figured I wasn’t the first person to think of changing the comment script name programmatically, and I was glad to see someone had already climbed that mountain.

    I also read on inf7.net about some scripting to programmatically make the changes that are needed to rename the wp-comments-post.php and wp-comments.php scripts and change the references to them in all the right places.

    So why not marry those ideas, and add a little spice to them? In fact, let’s make it so even I don’t know what the script names are!!! Here’s what we do.

    First, we need to add some stuff to the .htaccess file in the root of the WordPress installation (so, wherever your index.php and comment scripts live).

    I could just update .htaccess with the name of the real comments scripts, and send every other request for something that looks like the old comment scripts to the 403 bit-bucket. However, I want to ban their IP from the site, too, and just giving them perpetual 403’s when they hit the old scripts — but not for everything else on the site — just isn’t entertaining enough for me.

    So, the first part of the solution is to make the change to .htaccess as I described above. But, for the old script requests, I will direct them to the script we wrote above using a RewriteRule. So, .htaccess gets:

    # if they ask for the right posting script, but didn't come through me, they get 403

    RewriteCond %{HTTP_REFERER} "!^http://mysite.com/.*$" [NC]
    RewriteCond %{REQUEST_URI} ".*wp-dummy-comments-12345-post.php$"
    RewriteRule .* - [F,L]

    # if they ask for the right script, and came through me, then it's ok
    # if it's spam content, the spam filter can catch it

    RewriteCond %{HTTP_REFERER} "^http://mysite.com/.*$" [NC]
    RewriteCond %{REQUEST_URI} ".*wp-dummy-comments-12345-post.php$"
    RewriteRule .* - [PT,L]

    # if they look like they're trolling about for the posting script, they get banned -- needs work for use against bots

    RewriteCond %{REQUEST_URI} "wp([-dummy]*)-comments(-[0-9]*)-post.php$" [NC]
    RewriteRule .* "/wp-comments-post.php" [R]

    That last rule will direct them to the script I included way above in this missive, and ban them if they are snooping too much or trying to guess the comment post script name. And that, my faithful readers, will allow the bad guys to be banned when they hit the old scripts, no matter whether the request is for a real old script, or one fairly recent.
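
    Since that pattern "needs work", it helps to check which names it actually catches. Here is a small sketch that runs the same expression through grep; `looks_like_probe` and the sample paths are made up for the test. The one thing to confirm is that the tripwire script itself, wp-comments-post.php, does not match, otherwise the redirect would loop.

    ```shell
    # Succeeds when a request path would match the third ("trolling") rule.
    # -i stands in for the [NC] flag.
    looks_like_probe() {
        printf '%s\n' "$1" | grep -Eiq 'wp([-dummy]*)-comments(-[0-9]*)-post.php$'
    }

    for path in /wp-dummy-comments-99999-post.php \
                /wp-comments-1-post.php \
                /wp-comments-post.php \
                /index.php; do
        if looks_like_probe "$path"; then
            echo "ban:  $path"
        else
            echo "skip: $path"
        fi
    done
    ```

    Stale or guessed names with a number in them get sent to the tripwire, while the tripwire's own name and ordinary pages are left alone by this rule.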

    The next step is to create the script that will change the comment script names, incorporating something random, along with references to them.

    # get the old random number
    # (.htaccess mentions the script name more than once, so keep only the first match)
    OLDRANDOM=`egrep "wp-dummy-comments-([0-9]*)-post.php" .htaccess | head -n 1 | cut -d"-" -f4`

    NEWRANDOM=$RANDOM

    echo "Old " $OLDRANDOM " New " $NEWRANDOM

    # copy to the new filenames
    cp wp-dummy-comments-${OLDRANDOM}.php wp-dummy-comments-${NEWRANDOM}.php
    cp wp-dummy-comments-${OLDRANDOM}-post.php wp-dummy-comments-${NEWRANDOM}-post.php

    # change the php files
    replace 'wp-dummy-comments-'${OLDRANDOM} 'wp-dummy-comments-'${NEWRANDOM} -- *.php

    # change .htaccess
    replace 'wp-dummy-comments-'${OLDRANDOM} 'wp-dummy-comments-'${NEWRANDOM} -- .htaccess

    # remove the old copies
    rm wp-dummy-comments-${OLDRANDOM}.php
    rm wp-dummy-comments-${OLDRANDOM}-post.php

    We need to make sure that this script is executed every so often, so I stuck it in my crontab. How often to run it, thus changing the script names and references, is left as an exercise for the reader! 😉
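
    One caveat on the script above: as far as I know, the `replace` command it leans on is the string-replacement utility that ships with MySQL, so it may not be on every box. Here is a sketch of a stand-in built on sed; `rename_refs` is a name I made up, and it assumes the old and new names contain only letters, digits, and hyphens, so they can go into the sed expression unescaped.

    ```shell
    # Replace every occurrence of OLD with NEW in each listed file.
    # Usage: rename_refs OLD NEW file...
    rename_refs() {
        old=$1
        new=$2
        shift 2
        for f in "$@"; do
            tmp=$(mktemp) || return 1
            # Write the edited copy to a temp file, then copy it back over the
            # original so the file keeps its permissions.
            sed "s/$old/$new/g" "$f" > "$tmp" && cat "$tmp" > "$f"
            rm -f "$tmp"
        done
    }

    # In the rotation script this would stand in for the two replace lines:
    #   rename_refs "wp-dummy-comments-$OLDRANDOM" "wp-dummy-comments-$NEWRANDOM" *.php .htaccess
    ```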

So that’s how I’ve attacked the blogspambomber issue. YMMV, but for me, this seems to work pretty doggone well.