Stopping the spam
This weekend I was alerted to an issue with a site I created a while ago. The app was basically just a place where people can post stories and memories of a particular friend.
However, somebody noticed that a number of recent posts were spam. I took a look and they had titles such as the following:

Many dropped brand names in the title and body. The content of these posts consisted of meaningless sentences with random links and product names. It seemed as if the spam was structured to look meaningful to crawlers. My theory is that they’re trying to use it for black hat SEO while staying under the radar.
Another odd thing was that there weren’t very many posts. I’m not sure how long they’d been at it, but there were only about 10, with maybe 1 coming in per hour. I started thinking about what exactly they were doing. A successful post requires a CSRF token, which would have to be given out by my server along with that particular form. That means they couldn’t spam it entirely externally: they would first have to request the form and get the authenticity token before they could make the POST request. I wonder if this slowed them down at all.
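To make that two-step requirement concrete, here is a minimal sketch in plain Ruby. It is not how Rails implements its authenticity token; `TinyFormServer` and its methods are hypothetical names made up for illustration, and the plain string comparison stands in for the constant-time comparison a real framework would use.

```ruby
require "securerandom"

# Toy stand-in for a server that protects a form with a per-session token.
# Hypothetical class for illustration only, not Rails' actual mechanism.
class TinyFormServer
  attr_reader :stories

  def initialize
    @tokens = {}  # session id => token handed out with the form
    @stories = [] # stories that were accepted
  end

  # Step 1: the client requests the form and receives a fresh token.
  def get_form(session_id)
    @tokens[session_id] = SecureRandom.base64(32)
  end

  # Step 2: a POST is accepted only if it echoes the token for that session.
  def post_story(session_id, token, story)
    expected = @tokens[session_id]
    return :rejected if expected.nil? || token != expected

    @stories << story
    :accepted
  end
end

server = TinyFormServer.new

# A bot that POSTs blindly, without ever loading the form, is turned away.
blind = server.post_story("bot", "guessed-token", "spam")

# A bot that first fetches the form can reuse the token it was given.
token = server.get_form("bot")
informed = server.post_story("bot", token, "spam")
```

In this sketch `blind` ends up `:rejected` and `informed` ends up `:accepted`, which is the point: the token forces an extra request per post, but it does nothing to stop a bot that is willing to make that request.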
However, that’s no real resolution since it didn’t stop them. A couple of options came directly to mind. I could require users to sign up to post. However, that would go against one of the missions of the site, which was to let users leave stories with ease. And anonymity didn’t make a difference to me. Hence, I wasn’t going to force a login.
The main option that came to mind, and probably the most commonly used method to prevent bots, was the captcha. Since this project was in Rails, I started looking at various gems for captchas. The most popular tool seemed to be reCAPTCHA, which is maintained by Google. A lot of other options seemed to use RMagick to generate images based on your input, which was quite cool.
However, I personally hate having to fill out captchas and often get them wrong on the first try or two. Hence, I settled on one of the simplest solutions. I added an additional field to the form asking a question for which the answer is blatantly obvious (and easy to figure out) for anyone who should be posting a story on this particular page.
It’s the same answer every time, so it’s not very secure and could easily be added to a bot’s POST request. However, to figure out the answer, a spammer would need to actually look at the site and then add it to all future POST requests.
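As a rough sketch of what that check could look like, here it is in plain Ruby. The module name `HumanCheck`, the method `passes?`, and the answer `"example"` are all placeholders I made up for illustration, not the actual code or answer from the site. In a Rails app, something like this would presumably be called in the controller before saving the story.

```ruby
# Hypothetical check for the extra "obvious question" form field.
module HumanCheck
  # Placeholder value; the real answer is whatever is obvious to
  # legitimate visitors of the page.
  EXPECTED_ANSWER = "example"

  # Returns true when the submitted answer matches, ignoring case and
  # surrounding whitespace so real users aren't tripped up by typing style.
  # A missing field (nil) is treated as a wrong answer.
  def self.passes?(answer)
    answer.to_s.strip.downcase == EXPECTED_ANSWER
  end
end

# A bot that never saw the new field submits nothing for it.
bot_ok = HumanCheck.passes?(nil)

# A person typing the answer a bit sloppily still gets through.
human_ok = HumanCheck.passes?("  Example ")
```

Normalizing case and whitespace is a deliberate choice here: the goal is to keep the cost for real users as close to zero as possible, while a POST that omits the field or guesses wrong is simply rejected.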
I have a feeling that there aren’t too many humans behind these spam posts, and even if there were, it wouldn’t be worth their time. At the same time, adding the answer for a legit post is extremely easy. It’s enough to deter the bad guys, but an inconvenience so insignificant it won’t scare any real users away.
So far it appears to be successful as the spam has subsided!
Originally published at newtonry.tumblr.com.