AJAX comment spam


#1

Hi all…

I don’t know if anyone else has noticed this lately, but there has been
a lot of spam on my blog as of late even though non-AJAX commenting is
disabled and anti-spam is enabled.

Although the spam is completely useless – it doesn’t even contain a
link to any site – I’m sure it’s only a matter of time until they
figure out how to create links. :-/

There are a lot of trackback spams lately too. The worst thing is that
I have 30 on one article in under a minute, and they’re all from
different IP addresses (hence blacklisting is futile).

TX


#2

You guys might want to look into this concept I created last year:

http://www.i-marco.nl/weblog/archive/2005/08/24/trackback_spam_eliminated

I NEVER get trackback spam because of this.

  • Marco

#3

Huh. My impression with trackbacks was a lot of it was automated
through blog software (looking at links in your post and checking
them for trackback URLs). A javascript implementation like this would
completely break that functionality.


#4

I’ve never heard a distinction there. And in that case, what’s to
stop trackback-spams from using “pingback” instead?


#5

Nope, that’s pingback. Similar to trackback, but different.
Sending a trackback requires a manual action by the blogger who wants
to send one. Pingback however is done automatically.

  • Marco

#6

Trejkaz removed_email_address@domain.invalid writes:

- I receive an IM saying a new comment has been posted, which has a
  link to the admin interface for that article (I'm planning to patch
  this feature into my local branch already.)
- I go into the admin interface, and hit Delete.
- The delete confirmation page has a new button, "Delete and Block",
  which adds that IP onto the blocked list.

Hmm… I’ve been thinking along those lines too.

The only other thing we can do is raise the bar some more,
e.g. require OpenID authentication for all comments. But things
like that, a spammer can always work around. Unfortunately, I
really, really, really hate CAPTCHA setups, but that’s starting to
look like the only way to stop it.

You can get round CAPTCHAs too by re-serving the captcha images as
legitimate captchas on, say, your porn sites and feeding the punter’s
response back to the spammed site. Even if you miss the timeout 9
times out of 10, there’s always another punter.


#7

Piers C. wrote:

The only other thing we can do is raise the bar some more,
e.g. require OpenID authentication for all comments. But things
like that, a spammer can always work around. Unfortunately, I
really, really, really hate CAPTCHA setups, but that’s starting to
look like the only way to stop it.

You can get round CAPTCHAs too by re-serving the captcha images as
legitimate captchas on, say, your porn sites and feeding the punter’s
response back to the spammed site. Even if you miss the timeout 9
times out of 10, there’s always another punter.

I’m not sure I follow you, but how does this allow a spammer to decode
my CAPTCHA in order to successfully post a comment?

Ultimately it would surely come down to text recognition, and if the
CAPTCHA is good enough (or if it’s not even text, or does something
really unique) then that would make it much harder for a bot to get a
comment through.

Though of course, if they submit enough, then it comes down to
statistics, and eventually something will get through. But perhaps by
then, their IP has been auto-blacklisted.

I still don’t like CAPTCHAs though… at least not image-based ones.
Perhaps I can follow the math problem route, or do something really
unique. I remember one blog where it only asked you to enter a very
large number. :slight_smile:
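
The "math problem route" mentioned above could be sketched roughly as follows. This is only an illustration, not anything Typo ships: the function names are made up, and it assumes the expected answer is kept server-side (for example in the visitor's session) rather than sent to the browser.

```python
import random


def make_challenge(rng):
    """Return (question_text, expected_answer) for a simple sum.

    Hypothetical helper: the expected answer is meant to be stored
    server-side, never embedded in the comment form.
    """
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    return f"What is {a} plus {b}?", a + b


def check_answer(submitted, expected):
    """True only when the submitted string parses to the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        # Non-numeric input ("banana", "", "5.0") is simply a wrong answer.
        return False


if __name__ == "__main__":
    question, expected = make_challenge(random.Random())
    print(question)
    print(check_answer(str(expected), expected))
```

A question like this is trivial for a script written against one specific blog, so it only raises the bar for generic bots.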

TX


#8

Marco van Hylckama Vlieg wrote:

Nope, that’s pingback. Similar to trackback, but different.
Sending a trackback requires a manual action by the blogger who wants
to send one. Pingback however is done automatically.

Typo seems, at least on the surface, to consider the two to be exactly
the same animal.

In any case, I don’t think JavaScript is going to help in the long run.
As I said in my original message, the spambots have now figured out
how to submit blog posts even though I have non-AJAX commenting
disabled. So it isn’t like they’re afraid of a little JavaScript
anymore. We need something better.

A lot of this could probably be done via some more clever integration
between the admin UI and the spam blocking script.

e.g.:

- I receive an IM saying a new comment has been posted, which has a
  link to the admin interface for that article (I'm planning to patch
  this feature into my local branch already.)
- I go into the admin interface, and hit Delete.
- The delete confirmation page has a new button, “Delete and Block”,
  which adds that IP onto the blocked list.
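
The "Delete and Block" step above might look something like this minimal sketch. It is not Typo's code: the class and method names are hypothetical, and an in-memory dict and set stand in for whatever storage the admin interface really uses.

```python
import ipaddress


class CommentModeration:
    """Toy comment store with a 'Delete and Block' action (illustrative only)."""

    def __init__(self):
        self.comments = {}        # comment_id -> (ip, body)
        self.blocked_ips = set()  # normalised IP strings

    def post(self, comment_id, ip, body):
        """Store a comment unless its IP is blocked; return whether it was kept."""
        ip = str(ipaddress.ip_address(ip))  # raises ValueError on a malformed IP
        if ip in self.blocked_ips:
            return False
        self.comments[comment_id] = (ip, body)
        return True

    def delete(self, comment_id):
        """Plain 'Delete': remove the comment, return it (or None if unknown)."""
        return self.comments.pop(comment_id, None)

    def delete_and_block(self, comment_id):
        """'Delete and Block': remove the comment and block the IP it came from."""
        removed = self.delete(comment_id)
        if removed is not None:
            self.blocked_ips.add(removed[0])
        return removed
```

As noted earlier in the thread, this only helps when the spam keeps coming from the same address; 30 trackbacks from 30 different IPs would each need their own block.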

The only other thing we can do is raise the bar some more, e.g. require
OpenID authentication for all comments. But things like that, a spammer
can always work around. Unfortunately, I really, really, really hate
CAPTCHA setups, but that’s starting to look like the only way to stop
it.

TX


#9

Getting the image doesn’t do much without the session ID. You should
destroy the session anyway.

On 3/12/06, Kevin B. removed_email_address@domain.invalid wrote:

The spammer, who also runs a porn site, hits up your blog, sees your


Typo-list mailing list
removed_email_address@domain.invalid
http://rubyforge.org/mailman/listinfo/typo-list


Man Wit Da Plan.
http://d-jacobs.com


#10

On Mar 12, 2006, at 4:50 PM, Trejkaz wrote:

You can get round CAPTCHAs too by re-serving the captcha images as
legitimate captchas on, say, your porn sites and feeding the punter’s
response back to the spammed site. Even if you miss the timeout 9
times out of 10, there’s always another punter.

I’m not sure I follow you, but how does this allow a spammer to decode
my CAPTCHA in order to successfully post a comment?

The spammer, who also runs a porn site, hits up your blog, sees your
captcha, copies the image and re-serves it as the captcha for someone
visiting his porn site. That unknowing person successfully deciphers
the captcha, and the spammer takes the result and feeds it back to
the blog.


#11

Daejuan Jacobs wrote:

The spammer, who also runs a porn site, hits up your blog, sees your
captcha, copies the image and re-serves it as the captcha for someone
visiting his porn site. That unknowing person successfully deciphers
the captcha, and the spammer takes the result and feeds it back to
the blog.

Getting the image doesn’t do much without the session ID. You should
destroy the session anyway.

I see. This is like using Google Answers, Yahoo Answers, or any given
clone thereof. All the bot has to do is make a call-out to some
abstract service which answers the question, and that service just
uploads the image to practically anywhere they can find someone to
decipher it.

A porn site could host the same service, but of course, this assumes the
porn site has enough traffic for there to be a user online who would be
willing to do this for free. As soon as you start paying someone money,
then it costs to spam, and that’s probably against most spammers’
ethics.

It would work though, assuming such a bored user exists. And I mean,
any user with more than 10,000 kills on The Kill Everyone Project
probably fits into this category. Gives me a neat idea for a new web
site which does nothing but feed the users images to decode. Of course,
I wouldn’t do it for cracking other CAPTCHAs, purely to see just how
bored users get. :wink:

TX


#12

I see what you’re saying, but if my server deletes the session after
you access the page to get the image (or after a timeout), then what
you’re trying to serve me is invalid.
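
One reading of "delete the session" is a challenge that can be answered only once and only within a time limit. The sketch below assumes that reading; `ChallengeStore`, its method names and the 120-second default are all made up for illustration.

```python
import secrets
import time


class ChallengeStore:
    """Single-use CAPTCHA challenges that expire after ttl_seconds (sketch)."""

    def __init__(self, ttl_seconds=120.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so the expiry can be tested
        self._pending = {}   # challenge_id -> (expected_answer, issued_at)

    def issue(self, expected_answer):
        """Register a new challenge and return its unguessable id."""
        challenge_id = secrets.token_urlsafe(16)
        self._pending[challenge_id] = (expected_answer, self.clock())
        return challenge_id

    def redeem(self, challenge_id, submitted):
        """Check an answer; the challenge is discarded whatever the outcome."""
        entry = self._pending.pop(challenge_id, None)  # single use
        if entry is None:
            return False  # unknown id, or already answered once
        expected, issued_at = entry
        if self.clock() - issued_at > self.ttl:
            return False  # answered too late
        return submitted == expected
```

Note what this does and does not check: an answer that arrives in time under the original challenge id is accepted no matter who actually read the image.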

On 3/12/06, Kevin B. removed_email_address@domain.invalid wrote:

On Mar 12, 2006, at 4:50 PM, Trejkaz wrote:

http://www.tildesoft.com


Man Wit Da Plan.
http://d-jacobs.com


#13

Uhh, what? The spammer serves back the result in the same session
they got the captcha in the first place. This is an automated process
so it has the potential to be fast enough.


#14

Yes, that’s called a timeout. And Piers C. had it right when he said

Even if you miss the timeout 9 times out of 10, there’s always
another punter.

There’s no way for you to know, serverside, whether the access is by
a spammer or by a real user, so as long as the spammer gets an answer
to his captcha fast enough he can spam your blog with impunity.
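
To put a rough number on "miss the timeout 9 times out of 10": if each relayed attempt independently beats the timeout with probability p, the chance that at least one of n attempts gets through is 1 - (1 - p)^n. The independence assumption and the function name are mine, purely for illustration.

```python
def chance_of_any_success(per_attempt_success, attempts):
    """Probability that at least one of `attempts` independent tries succeeds."""
    if not 0.0 <= per_attempt_success <= 1.0:
        raise ValueError("per_attempt_success must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must not be negative")
    # Complement of "every single attempt fails".
    return 1.0 - (1.0 - per_attempt_success) ** attempts


if __name__ == "__main__":
    for n in (1, 10, 50):
        print(n, round(chance_of_any_success(0.1, n), 4))
```

With a 1-in-10 hit rate, ten attempts already give roughly a 65% chance of at least one comment landing, which is the point of "there’s always another punter".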