Forum: Mongrel Why not ignore stale PID files?

Gunnar Wolf (Guest)
on 2008-06-06 04:19
(Received via mailing list)
Hi,

I have an application which is dying horrible deaths
(i.e. segmentation faults) in mid-flight, in production... And of
course, I should fix it. But while hunting down the bugs, I found
something I think should work differently. I can work on submitting a
patch, as it is quite simple, but I might be missing something in my
reasoning.

When Mongrel segfaults, it does not -obviously- get to clean up after
itself, so it does not remove the PID files. As an example:

$ sudo /etc/init.d/mongrel-cluster start
Starting mongrel-cluster: Starting all mongrel_clusters...
mongrel-cluster.
$ sudo cat tmp/pids/mongrel.8203.pid | xargs kill -9
$ sudo /etc/init.d/mongrel-cluster status
(...)
found pid_file: tmp/pids/mongrel.8203.pid
missing mongrel_rails: port 8203
(...)
$ sudo /etc/init.d/mongrel-cluster restart
Restarting mongrel-cluster: Restarting all mongrel_clusters...
** !!! PID file tmp/pids/mongrel.8203.pid already exists.  Mongrel could
be running already.  Check your log/mongrel.8203.log for errors.
** !!! Exiting with error.  You must stop mongrel and clear the .pid
before I'll attempt a start.
mongrel-cluster.

So, what's the solution? I must manually do:

$ sudo rm tmp/pids/mongrel.8203.pid
$ sudo /etc/init.d/mongrel-cluster restart

And now it works.

What should happen? Well, 'status' already found that there is a stale
PID. Of course, the 'status' action means exactly that: Get the
status, do nothing else. But the 'stop' action should clean up PID files
whose processes no longer exist, and the 'start' action should check
whether the process with that PID is alive, and ignore the PID file if
it's not. At least, this behaviour should be selectable via the
configuration file.
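
As a rough sketch of the 'start'-time check proposed here (not Mongrel's
actual code): the helper name clear_stale_pid_file is hypothetical, and the
sketch assumes a Unix-like system where Process.getpgid raises Errno::ESRCH
for a PID that does not exist.

```ruby
# Hypothetical helper, not part of Mongrel or mongrel_cluster.
# Returns true if a start may proceed (no PID file, or a stale one that
# was removed) and false if the recorded process still exists.
def clear_stale_pid_file(pid_file)
  return true unless File.exist?(pid_file)

  pid = File.read(pid_file).to_i
  if pid > 0
    begin
      Process.getpgid(pid)  # raises Errno::ESRCH if no such process
      return false          # process is alive: leave the PID file alone
    rescue Errno::ESRCH
      # fall through: the PID file is stale
    end
  end

  File.unlink(pid_file)
  true
end
```

A start script could call clear_stale_pid_file('tmp/pids/mongrel.8203.pid')
and refuse to start only when it returns false. Note that this confirms only
that some process has that PID, not that the process is a Mongrel.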

What do you think?

--
Gunnar Wolf - gwolf@iiec.unam.mx - (+52-55)5623-0154 / 1451-2244
PGP key 1024D/8BB527AF 2001-10-23
Fingerprint: 0C79 D2D1 2C4E 9CE4 5973  F800 D80E F35A 8BB5 27AF
Tikhon Bernstam (Guest)
on 2008-06-06 04:29
(Received via mailing list)
use the mongrel_cluster --clean option
Zed A. Shaw (Guest)
on 2008-06-06 07:08
(Received via mailing list)
On Thu, 5 Jun 2008 16:08:06 -0500
Gunnar Wolf <gwolf@gwolf.org> wrote:

> What should happen? Well, 'status' already found that there is a stale
> PID. Of course, the 'status' action means exactly that: Get the
> status, do nothing else. But the 'stop' action should clean up PID files
> whose processes no longer exist, and the 'start' action should check
> whether the process with that PID is alive, and ignore the PID file if
> it's not. At least, this behaviour should be selectable via the
> configuration file.

That would be the ideal situation, but Ruby doesn't have good enough
process management APIs to do this portably.  To make it work you'd
have to portably be able to take a PID and see if there's a mongrel
running with that PID.

You can't use /proc or /sys because that's Linux-only.  You can't use
`ps` because the OSX morons changed everything, Solaris has a different
output format, etc.

If you were to do this, you'd have to dip into C code to pull it off.

Now, if you're only on Linux then you could write yourself a small
little hack to the mongrel_rails script that did this with info out
of /proc.
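
A minimal sketch of such a Linux-only check, assuming the process to look
for has "mongrel_rails" somewhere in its command line. The helper name
proc_cmdline_matches? and the default pattern are illustrative assumptions,
not part of the mongrel_rails script.

```ruby
# Linux-only sketch: read /proc/<pid>/cmdline and test it against a pattern.
# Helper name and default pattern are hypothetical.
def proc_cmdline_matches?(pid, pattern = /mongrel_rails/)
  # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
  cmdline = File.binread("/proc/#{pid}/cmdline").tr("\0", ' ')
  !!(cmdline =~ pattern)
rescue Errno::ENOENT, Errno::ESRCH, Errno::EACCES
  # No such process, no /proc on this platform, or not allowed to read it.
  false
end
```

On systems without /proc the helper simply returns false, so it cannot tell
"not running" apart from "cannot tell" there.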

--
Zed A. Shaw
- Hate: http://savingtheinternetwithhate.com/
- Good: http://www.zedshaw.com/
- Evil: http://yearofevil.com/
Erik Hetzner (Guest)
on 2008-06-06 20:08
(Received via mailing list)
Istvan Szukacs (Guest)
on 2008-06-06 20:58
(Received via mailing list)
kill -0 `cat pid_file` >& /dev/null

more like

kill -0 $(<pid_file) >& /dev/null

regards,
Istvan
Gunnar Wolf (Guest)
on 2008-06-06 23:41
(Received via mailing list)
Zed A. Shaw dijo [Fri, Jun 06, 2008 at 01:01:32AM -0400]:
>
> Now, if you're only on linux then you could write yourself a small
> little hack to the mongrel_rails script that did this with info out
> of /proc.

Oh, silly me... I thought Ruby's Process class dealt with the
architectural incompatibilities... What I wrote to check for the
status is quite straightforward:

    ------------------------------------------------------------
    #!/usr/bin/ruby
    # Remove stale Mongrel PID files; restart the cluster if any were found.
    require 'yaml'
    confdir = '/etc/mongrel-cluster/sites-enabled'
    restart_cmd = '/etc/init.d/mongrel-cluster restart'
    needs_restart = false

    (Dir.entries(confdir) - ['.', '..']).each do |site|
      conf = YAML.load_file(File.join(confdir, site))
      # Glob for the per-port PID files (e.g. mongrel.8203.pid).
      pid_location = File.join(conf['cwd'], conf['pid_file']).gsub(/\.pid$/, '*.pid')
      pid_files = Dir.glob(pid_location)

      pid_files.each do |pidf|
        # to_i also drops the trailing newline, keeping the warning on one line.
        pid = File.read(pidf).to_i
        begin
          # Raises Errno::ESRCH when no process with this PID exists.
          Process.getpgid(pid)
        rescue Errno::ESRCH
          warn "Process #{pid} (cluster #{site}) is dead!"
          File.unlink pidf
          needs_restart = true
        end
      end
    end

    system(restart_cmd) if needs_restart
    ------------------------------------------------------------

(periodically run via cron)

I guess this works in any Unixy environment... I have no idea
whether Windows implements something similar to Process.getpgid, or
for that matter, know anything about Windows' process management.

Greetings,

Gunnar Wolf (Guest)
on 2008-06-06 23:41
(Received via mailing list)
Tikhon Bernstam dijo [Thu, Jun 05, 2008 at 07:29:22PM -0700]:
> use the mongrel_cluster --clean option

Very good addition to the overall logic, keeps things cleaner :-)

Eric Wong (Guest)
on 2008-06-07 01:18
(Received via mailing list)
Gunnar Wolf <gwolf@gwolf.org> wrote:
> > If you were to do this, you'd have to dip into C code to pull it off.
> >

> I guess this works in any Unixy environment... I have no idea on
> whether Windows implements something similar to Process.getpgid, or
> for that matter, anything on Windows' process management.

Process.kill(0, pid) also works and is (in my experience) more
widely used.
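
A small sketch of that approach, assuming a Unix-like system. The helper
name process_alive? is hypothetical. Signal 0 delivers nothing; the call
only reports whether the target exists and whether it may be signalled.

```ruby
# Hypothetical helper: true if a process with this (positive) PID exists.
def process_alive?(pid)
  Process.kill(0, pid)  # signal 0: existence/permission check only
  true
rescue Errno::ESRCH
  false                 # no such process
rescue Errno::EPERM
  true                  # process exists but belongs to another user
end
```

For a PID file this could be used as
process_alive?(File.read('tmp/pids/mongrel.8203.pid').to_i). Like the
getpgid check, it shows that the PID is in use, not that the process is a
Mongrel.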