Forum: Ruby on Rails my bug, nefarious scraper or a legitimate browser plugin?

Jodi Showers (Guest)
on 2009-03-30 18:05
(Received via mailing list)
I've been facing the following symptoms for some time.

I have links coded as :post or :put, so I can make sure that bots
aren't hitting particular links.

But something is hitting them as :get: either through an error I've
made (like link_to not working well in some browsers?), or because
there are one or more plugins that preload URLs, or because I have
scrapers.

Each day I'll get 50-100 error messages - where routes aren't found -
of this nature.

When I get many of these hits from the same IP address, I usually
assume a scraper, then block that IP address... but I don't want to do
this in every case if it's possible that a legitimate source (a
pre-loader browser plugin) is causing this to happen.
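
A minimal sketch of the "many hits from the same IP" check described above, in plain Ruby. The class name SuspectTracker and the threshold are assumptions made for illustration, not anything from the application; the idea is only to flag an address after repeated hits instead of blocking on the first one.

```ruby
# Hypothetical helper: counts GET hits on POST-only URLs per IP address.
class SuspectTracker
  # threshold: number of hits after which an IP is considered suspect
  # (the default of 20 is an arbitrary assumption).
  def initialize(threshold = 20)
    @threshold = threshold
    @hits = Hash.new(0)
  end

  # Record one bad hit from the given IP and return its running count.
  def record(ip)
    @hits[ip] += 1
  end

  # IPs whose hit count has reached the threshold, sorted for stable output.
  def suspects
    @hits.select { |_ip, count| count >= @threshold }.keys.sort
  end
end
```

An address that shows up once or twice (a curious visitor, a browser plugin) stays below the threshold, so only persistent sources end up on the list to review or block.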

Does anybody else see this kind of behaviour? How do you handle it?

thanks.
Jodi
Frederick Cheung (Guest)
on 2009-03-30 19:23
(Received via mailing list)
On Mar 30, 4:09 pm, Jodi Showers <j...@homestars.com> wrote:
> I've been facing the following symptoms for some time.
>
> I have links coded as :post or :put, so I can make sure that bots
> aren't hitting particular links.
>
> But something is hitting them as :get: either through an error I've
> made (like link_to not working well in some browsers?), or because
> there are one or more plugins that preload URLs, or because I have
> scrapers.
>

A browser with JS turned off would also do this (as would a Firefox
plugin like NoScript that only enables JS for certain websites).

Fred
Jodi Showers (jshow)
on 2009-03-30 20:31
(Received via mailing list)
On 30-Mar-09, at 1:22 PM, Frederick Cheung wrote:

>>
>
> A browser with js turned off would also do this (or using a firefox
> plugin like noscript to only have it on for certain websites)
>
> Fred

ty Fred - yes. good thought - likely the simplest explanation.

will trap, then see where that leads me.
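
One way such a trap could look, as a rough sketch only: a Rack-style middleware written in plain Ruby that records GET requests to paths that should only ever see POST or PUT. The class name GetTrap and the /messages/ pattern are assumptions for illustration. The env keys are the standard Rack/CGI ones, and X-Moz: prefetch is the header Firefox sends with link-prefetch requests; other preloaders may not identify themselves at all.

```ruby
# Rack-style middleware sketch (plain Ruby; needs no gems to run).
class GetTrap
  attr_reader :trapped

  # post_only: hypothetical pattern for paths that should only see POST/PUT.
  def initialize(app, post_only = %r{\A/messages/})
    @app = app
    @post_only = post_only
    @trapped = []
  end

  def call(env)
    if env["REQUEST_METHOD"] == "GET" && @post_only.match?(env["PATH_INFO"].to_s)
      # Keep the details needed to tell browsers, preloaders and bots apart.
      @trapped << {
        :path       => env["PATH_INFO"],
        :ip         => env["REMOTE_ADDR"],
        :user_agent => env["HTTP_USER_AGENT"],
        :prefetch   => env["HTTP_X_MOZ"] == "prefetch"
      }
    end
    # Always pass the request on; this only observes, it does not block.
    @app.call(env)
  end
end
```

In a real application the entries would go to a log rather than an in-memory array, but the recorded fields are the ones that help separate a JS-less browser from a prefetcher or a scraper.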

Jodi
Jodi Showers (jshow)
on 2009-04-01 16:48
(Received via mailing list)
This latest info rules out JS-off or a noscript plugin -

On 30-Mar-09, at 1:22 PM, Frederick Cheung wrote:
>>
>
> A browser with js turned off would also do this (or using a firefox
> plugin like noscript to only have it on for certain websites)
>
> Fred

Here's the markup where the link is:

<a style="margin-left: 0pt; padding-left: 0pt; font-size: 16px; margin-bottom: 60px;"
   onclick="Windows.overlayHideEffectOptions = {duration: 3};
     Dialog.info({url: 'http://homestars.com/messages/201895-contex-roofin...
     , options: {method: 'post'}},
     {title:'Contact Us', className: 'bluelighting', width: 290, height: 450,
      destroyOnClose: true, draggable: true, evalScripts: true});; return false;"
   href="#">
  <img style="margin-right: 5px;" alt="Email Contex Roofing Company Ltd." src="/images/contact_email.jpg"/>
  <img style="margin-right: 5px;" alt="Phone Contex Roofing Company Ltd." src="/images/contact_phone.jpg"/>
  <span style="font-size: 14px;">Contact: Contex Roofing Company Ltd.</span>
</a>

The bot/human is reaching the URL
"http://homestars.com/messages/201895-contex-roofin...", but you can
see that the href is '#', so something is scanning the HTML, looking
for URLs to harvest.
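
To illustrate how a URL that never appears in an href can still be picked up, here is a small sketch in plain Ruby. The markup and the example.com URL are made up for the example, and the regexes are only a guess at what a naive harvester might do, not the behaviour of any specific bot or plugin.

```ruby
# Hypothetical markup modelled on the link above; example.com stands in
# for the real URL.
html = <<-HTML
<a href="#" onclick="Dialog.info({url: 'http://example.com/messages/1', options: {method: 'post'}}); return false;">Contact</a>
HTML

# What a crawler that only follows links would see: the href values.
hrefs = html.scan(/href="([^"]*)"/).flatten

# What a naive harvester finds: anything URL-shaped anywhere in the
# page text, including inside the onclick JavaScript.
harvested = html.scan(%r{https?://[^\s'"<>]+})
```

Here hrefs contains only "#", while harvested contains the URL from the onclick handler, which would then be fetched with a plain GET.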

So looks like either a bot or a page preloader...

I don't mind pre-loaders - so I think I'll see if I can find a pattern
in the plugins loaded... and if I don't find a plugin then this could
work as a honeypot.

I guess I don't have a specific question - merely symptoms - hopeful
that someone may have faced such a thing.

Jodi