Forum: Ruby Ruby/Watchcat 1.0.0

Andre Nathan (Guest)
on 2006-04-22 01:05
(Received via mailing list)
Hello

I'm pleased to announce the release of Ruby/Watchcat 1.0.0.

Ruby/Watchcat is an extension for Ruby for the development of
watchcatd-aware applications.

Watchcatd is a watchdog-like daemon in the sense that it takes actions
in situations where a machine is under heavy load and/or unresponsive.
However, watchcatd isn't as drastic as the usual watchdog systems, which
reboot the machine. Instead, all it does is send a signal to a
registered process (which by default is SIGKILL) if the process doesn't
send it a heartbeat before a user-specified timeout.
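
The heartbeat rule described above can be modelled in a few lines of plain Ruby. This is only a sketch of the idea, not watchcatd's actual code: the class name `HeartbeatDeadline` and its methods are hypothetical, and the real daemon enforces the deadline from outside the monitored process.

```ruby
# Hypothetical in-process model of the heartbeat rule; not watchcatd's code.
class HeartbeatDeadline
  # timeout: seconds allowed between heartbeats.
  # now: injectable clock value, so the logic can be checked without sleeping.
  def initialize(timeout, now = Time.now)
    @timeout = timeout
    @last_beat = now
  end

  # Record a heartbeat, which pushes the deadline forward.
  def heartbeat(now = Time.now)
    @last_beat = now
  end

  # True once more than `timeout` seconds have passed since the last heartbeat.
  # This is the point at which the daemon would send the configured signal.
  def expired?(now = Time.now)
    now - @last_beat > @timeout
  end
end
```

Passing the clock in explicitly keeps the sketch deterministic; a real monitor would simply use the current time.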

Ruby/Watchcat lets you register Ruby applications with watchcatd.

Examples:

  require 'watchcat'

  # Create a new cat.
  cat = Watchcat.new(:timeout => 10, :signal => 'KILL',
                     :info => 'killing from ruby!')
  loop do
    # Here you do something that could exceed the timeout
    sleep 9 + rand(3)
    cat.heartbeat # we're still alive
  end
  cat.close # clean the cat's litter box

You can also use a block, in which case the cat cleans its own litter
box:

  require 'watchcat'

  Watchcat.new do |cat|
    loop do
      do_something_that_can_be_slow
      cat.heartbeat
    end
  end

For more details, please refer to the README file in the distribution
and on the project's homepage at http://oss.digirati.com.br/ruby-watchcat/.

This is my first Ruby C extension, so I would greatly appreciate
comments and suggestions :)

Best regards,
Andre Nathan
unknown (Guest)
on 2006-04-22 16:05
(Received via mailing list)
On Sat, 22 Apr 2006, Andre Nathan wrote:

> Hello
>
> I'm pleased to announce the release of Ruby/Watchcat 1.0.0.
>
<snip>

so, in effect, this is a Timeout::timeout based on a child process?

look handy for some of the work i'm doing now - which is a 24x7 satellite
ingest system which spawns many processes which can potentially hang...

-a
Andre Nathan (Guest)
on 2006-04-22 18:33
(Received via mailing list)
Hi Ara

On Sat, 2006-04-22 at 23:03 +0900, ara.t.howard@noaa.gov wrote:
> so, in effect, this is a Timeout::timeout based on a child process?

Yes, except that when the timeout expires, it triggers an action by the
watchcat daemon (which by default SIGKILLs the process).
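
For comparison, here is a minimal sketch of the in-process approach with Ruby's standard Timeout library. The helper name `run_with_timeout` is hypothetical. The difference from watchcatd is that the timeout is enforced inside the same interpreter by raising an exception, so it presumably cannot help when the whole process is wedged, whereas an external daemon can still deliver a signal.

```ruby
require 'timeout'

# Hypothetical helper: run the block and report whether it finished in time.
def run_with_timeout(seconds)
  Timeout.timeout(seconds) { yield }
  :ok
rescue Timeout::Error
  # Raised inside this process; the process itself keeps running.
  :timed_out
end

puts run_with_timeout(1) { sleep 0.1 }  # the block finishes in time
puts run_with_timeout(0.2) { sleep 2 }  # the block is interrupted
```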

> look handy for some of the work i'm doing now - which is a 24x7 satellite
> ingest system which spawns many processes which can potentially hang...

Our main use for it is in a shared hosting environment. We wrote a
mod_watchcat for apache2 and use watchcatd to kill misbehaving customer
scripts (which has helped increase our servers' uptime a lot). When I
wrote the extension, my idea was to use it for something similar, maybe
with Mongrel, but I guess it would be useful for your satellite
application too :)

Regards,
Andre
Zed Shaw (Guest)
on 2006-04-22 19:57
(Received via mailing list)
Hmmm, me thinking...


On 4/21/06 7:03 PM, "Andre Nathan" <andre@digirati.com.br> wrote:

> reboot the machine. Instead, all it does is send a signal to a
> registered process (which by default is SIGKILL) if the process doesn't
> send it a heartbeat before a user-specified timeout.

With Mongrel you've got the situation that a single Mongrel server could
potentially be handling many requests at once, so killing off a dead one
could really make things bad.  But, in a shared hosting environment this
would be perfect for catching the poorly coded servers that eat up resources.

I kind of like this solution though since it is more difficult for the
person to cheat.  They can't really turn it off by injecting Ruby code since
they still have to talk to the watchcat.  I'm curious if they could cheat
other ways such as transferring the socket to another process which always
works.

On another note, you know there are options for throttling and restricting
the number of active threads in Mongrel, right?  -t will do a timeout (says
seconds in the docs but it's actually 1/100th of a second) between each
socket accept.  -n will make sure the number of processor threads doesn't go
above a given limit.

Zed A. Shaw
http://www.zedshaw.com/
http://mongrel.rubyforge.org/
Andre Nathan (Guest)
on 2006-04-22 20:50
(Received via mailing list)
Hi Zed

On Sun, 2006-04-23 at 02:53 +0900, Zed Shaw wrote:
> With Mongrel you've got the situation that a single Mongrel server could
> potentially be handling many requests at once, so killing off a dead one
> could really make things bad.  But, in a shared hosting environment this
> would be perfect for catching the poorly coded servers that eat up resources.

Yes, the library is better suited for a multi-process model, like the
pre-forking MPM in apache2, because we can just kill the process that is
eating the resources without killing the whole server.

I'm actually not very familiar with Mongrel (just used it for a simple
Camping app), although I plan to look into it when time permits. I'm guessing
that to make it work with Mongrel, one would need a wrapper script to
launch the server, so that it could be relaunched after it's killed.
Does that make sense?
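
A rough sketch of what such a wrapper could look like, using only Ruby's core Process API. The method name `supervise` and the `max_restarts` cap are hypothetical, and the sketch assumes a Ruby version with keyword arguments; it restarts the child only when it died from a signal (as it would after a SIGKILL from watchcatd) and stops on a normal exit.

```ruby
# Hypothetical supervisor: relaunch a command whenever it is killed by a signal.
# Returns the number of restarts performed.
# `max_restarts` exists only to keep the sketch from looping forever.
def supervise(command, max_restarts: 3)
  restarts = 0
  loop do
    pid = Process.spawn(*command)
    _, status = Process.wait2(pid)
    # A normal exit (successful or not) ends supervision;
    # only a death by signal triggers a relaunch.
    return restarts unless status.signaled?
    return restarts if restarts >= max_restarts
    restarts += 1
    warn "child #{pid} killed by signal #{status.termsig}, restart ##{restarts}"
  end
end
```

Something like `supervise(%w[mongrel_rails start])` would then keep the server coming back after each kill, assuming the server is started in the foreground so the wrapper can wait on it.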

> I kind of like this solution though since it is more difficult for the
> person to cheat.  They can't really turn it off by injecting Ruby code since
> they still have to talk to the watchcat.  I'm curious if they could cheat
> other ways such as transferring the socket to another process which always
> works.

In our environment, the user has no control of the watchcat (it is
created by mod_watchcat), so to pass the socket descriptor to another
process the user would first have to guess what the descriptor is. It's
not impossible to do it, but then it would be easy for us to identify a
user doing that.
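
For readers unfamiliar with descriptor passing, the general Unix mechanism looks roughly like this in Ruby. It is only an illustration of the technique being discussed, not of how watchcatd or mod_watchcat talk to each other; the helper name `pass_descriptor` is hypothetical, and for brevity both ends of the socket pair live in one process, whereas a real transfer would have the receiving end in another process.

```ruby
require 'socket'

# Hypothetical helper: send an open descriptor across a UNIX socket pair
# and return the IO object created on the receiving side.
def pass_descriptor(io)
  sender, receiver = UNIXSocket.pair
  sender.send_io(io)
  receiver.recv_io  # a new IO object referring to the same open file
ensure
  sender&.close
  receiver&.close
end
```

The sender must already hold the open descriptor in order to pass it along, which is why not knowing which descriptor belongs to the watchcat connection gets in the way.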

Regards,
Andre