Fighting Spam In Drupal: Big Pipe to the rescue

Difficulty: 
Piece of Cake

In a previous post I explained how we are using BigPipe in Drupal 7 (Cheap Pipe (sort of BigPipe) in Drupal 7). Besides all the known benefits of BigPipe, there is a lesser-known side effect that might help you fight spam.

Those out there fighting for conversion optimization know that CAPTCHA-based form protection is a conversion killer (even with the new NoCaptcha from Google). That's why you should always be looking for ways to protect your web application from bots without real users noticing or having their workflows affected.

Even non-intrusive methods such as Honeypot or Hidden Captcha are becoming less effective as bots get more sophisticated.

For us, what has historically worked very well against SPAM is real e-mail verification when capturing cold leads. This basically consists of trying to deliver a fake message to the e-mail address submitted to the form. Addresses used by SPAM bots usually do not exist or have been throttled by the e-mail provider. Of course, e-mail verification is the last stand, applied only after a submission has made it through a set of non-intrusive form protection techniques (such as honeypot and/or hidden captcha).
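The post does not say how the verification is implemented. One common way to approximate "trying to deliver a message" without actually sending anything is an SMTP recipient probe, sketched below in Python under that assumption. The function name `mailbox_accepts` and the `probe_sender` address are hypothetical, and the caller is assumed to have already connected an `smtplib.SMTP` client to the recipient's mail server.

```python
import smtplib


def mailbox_accepts(smtp, address, probe_sender="probe@example.com"):
    """Return True if the mail server accepts `address` as a recipient.

    `smtp` is an already-connected smtplib.SMTP-style client (finding the
    right server for the address is out of scope for this sketch).
    No message is sent: the session is reset right after RCPT TO.
    """
    try:
        smtp.ehlo_or_helo_if_needed()
        smtp.mail(probe_sender)          # MAIL FROM
        code, _ = smtp.rcpt(address)     # RCPT TO, returns (code, message)
        smtp.rset()                      # abandon the transaction
    except smtplib.SMTPException:
        return False
    # Any 2xx reply means the server is willing to take mail for the address.
    return 200 <= code < 300
```

Some providers accept every recipient and bounce later, so a probe like this can only rule addresses out; it cannot prove that a real person reads the mailbox.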

A recent change in one of our customers' applications resulted in an increased number of fake leads being sent to Sendy. These leads were able to pass e-mail verification because they used valid, non-throttled e-mail addresses. If you don't know what Sendy is, it's a self-hosted "simple Mailchimp" that can bring massive savings to e-mail based marketing.

 

So we decided to force BigPipe on the block that renders the new lead capturing form, and SPAM submissions stopped immediately. This makes some sense. Content that is rendered using BigPipe is loaded into the DOM through AJAX. I guess that, for performance reasons, most bots don't use a fully emulated browser and instead rely on parsing/scanning the HTML response from the server to perform automatic form submissions. What used to be a plainly rendered form is now, to the eyes of the bot, just a blob of JavaScript.
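To make the guess above concrete, here is a minimal Python sketch of a naive bot that only scans the static HTML for form tags. Both markup samples, the `fillPlaceholder` function name, and the placeholder id are hypothetical illustrations, not the actual Cheap Pipe output.

```python
from html.parser import HTMLParser


class FormFinder(HTMLParser):
    """Collects the action attribute of every <form> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.actions.append(dict(attrs).get("action"))


def find_forms(html):
    """Return the form actions a static HTML scanner would discover."""
    finder = FormFinder()
    finder.feed(html)
    finder.close()
    return finder.actions


# Hypothetical markup: the form rendered directly into the page.
PLAIN_HTML = (
    '<div id="lead-form">'
    '<form action="/lead" method="post"><input name="mail"></form>'
    '</div>'
)

# Hypothetical markup: an empty placeholder plus a script that injects
# the (JSON-escaped) form markup once the browser runs it.
DEFERRED_HTML = (
    '<div id="lead-form-placeholder"></div>'
    r'<script>fillPlaceholder("lead-form-placeholder", '
    r'"\u003cform action=\"/lead\" method=\"post\"\u003e\u003c/form\u003e");</script>'
)

print(find_forms(PLAIN_HTML))
print(find_forms(DEFERRED_HTML))
```

The scanner finds the form in the first sample and nothing in the second, because the form only exists after a browser executes the script.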

As a final word, these are the steps you need to follow to completely stop spam:

  • Try ordinary site-building methods (honeypot, captcha, etc.)
  • Code some custom integration of those methods with your public forms
  • Implement application specific validations
  • Hire someone who knows what they are doing

Comments

Haha! Unexpected usage of BigPipe!

So clients with javascript are okay and those without javascript are spammers?
Then you might want to use a better javascript detection than using cheap pipe.
After all, that blob of javascript is just as easy to parse (and submit) for a bot as the original form.

0.2% of clients do not support JavaScript. If you subtract the spammers from that number, you probably end up with fewer actual users than Netscape or IE 5... You'd be better off testing in those browsers than working on supporting JavaScript-less users.

"So clients with javascript are okay and those without javascript are spammers?"

I can't find that statement anywhere in the post.

"Then you might want to use a better javascript detection than using cheap pipe."

We can safely assume that everyone has JavaScript. Trying to support users without it is like trying to keep pushing support for Flash Player, or trying to comply with accessibility standards. It just does not make business sense (unless it's a government project that needs to be certified).

"After all, that blob of javascript is just as easy to parse (and submit) for a bot as the original form."

Of course! But bots, at least the ones annoying us, are NOT doing so. If someone wants to SPAM you they will; it's just a matter of time and resources. What if I encoded that blob in base64 (or anything else that is straightforward)? It just takes 5 minutes of coding. If I'm the only one doing it, then unless the bot is specifically targeting my sites they won't bother. Of course, if they go for full browser emulation then this won't work anymore, but that makes their work much, much slower.

Most bots are designed to deal with the most common use cases, so as long as you keep your application "away" from whatever everyone else is doing to protect themselves from SPAM you can get very nice results.

You would be surprised how effective a couple of lines of code making sure that a field that captures the name and/or surname of a person does not contain numbers is against SPAM.
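As a rough illustration of that kind of check, here is a minimal Python sketch. The function name `looks_like_person_name` is hypothetical, and the rule is deliberately crude: it only rejects empty values and values containing a digit.

```python
def looks_like_person_name(value):
    """Accept a name/surname field only if it is non-empty and has no digits."""
    value = value.strip()
    return bool(value) and not any(ch.isdigit() for ch in value)


print(looks_like_person_name("Ada Lovelace"))
print(looks_like_person_name("John123"))
```

A submission that fails the check can be dropped before it ever reaches e-mail verification.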

We use another approach that works for the same reason but is cleaner for the end user. We ran across the idea somewhere else, so it's not original to us, but it isn't mentioned here. Add a hidden date field to the form whose value is the date/time when the form loaded. Then check the submitted value against the current date/time when the form gets submitted. If the difference between the time the form loads and the time it is submitted is less than 5 seconds, then it's a bot. If your form is long you can increase the number; if your form is very short you may need to decrease it. This in combination with a honeypot field is very effective. The Honeypot module for Drupal has both of these methods available, although you may want to alter the form names/ids to make your site's form fields uniquely named.
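A minimal Python sketch of the timing check described above. The 5-second threshold comes from the comment; the HMAC signature on the hidden value is an added assumption (so that a bot cannot simply forge an old timestamp), and `SECRET_KEY`, `make_token` and `is_probable_bot` are hypothetical names.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical per-site secret
MIN_SECONDS = 5            # threshold suggested in the comment above


def _sign(timestamp_text):
    return hmac.new(SECRET_KEY, timestamp_text.encode(), hashlib.sha256).hexdigest()


def make_token(loaded_at=None):
    """Value for the hidden field: form load time plus a signature."""
    if loaded_at is None:
        loaded_at = time.time()
    timestamp_text = str(int(loaded_at))
    return f"{timestamp_text}:{_sign(timestamp_text)}"


def is_probable_bot(token, submitted_at=None, min_seconds=MIN_SECONDS):
    """True if the token was tampered with or the form came back too fast."""
    if submitted_at is None:
        submitted_at = time.time()
    timestamp_text, _, signature = token.partition(":")
    expected = _sign(timestamp_text)
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return True  # forged or missing timestamp
    return submitted_at - int(timestamp_text) < min_seconds


token = make_token(loaded_at=1000)
print(is_probable_bot(token, submitted_at=1002))
print(is_probable_bot(token, submitted_at=1010))
```

With the form loaded at second 1000, a submission 2 seconds later is flagged and one 10 seconds later is not.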

Very interesting. Thanks for sharing.

By: root Sunday, May 29, 2016 - 00:00