Not all SIGCHLD signals received

I tried to run this script:

processes_count = 5

@rout_objects = []

trap('SIGCHLD') do
  # Non-blocking wait: returns the PID of one exited child, or nil
  pid = Process.wait(-1, Process::WNOHANG)
  puts pid.inspect
end

processes_count.times do |index|
  puts "creating new process #{index}"
  rout, wout = IO.pipe

  pid = fork {
    # Send the child's output through the pipe
    $stdout.reopen wout

    process_name = "Process ##{index}"
    Process.setproctitle process_name

    2.times do |i|
      puts "#{process_name}: Message ##{i}"
      sleep 1
    end
    exit 0
  }
  @rout_objects << rout
end

loop do
  out_ready, _, _ = IO.select(@rout_objects, nil, nil)
  out_ready.each do |rout|
    begin
      puts rout.read_nonblock(100)
    rescue EOFError
    end
  end
end

It spawns 5 processes that each print a few lines and exit. The main process traps SIGCHLD to print the children's PIDs. But sometimes I see that not all SIGCHLD signals are trapped. Output example:

creating new process 0
creating new process 1
creating new process 2
creating new process 3
creating new process 4
Process #0: Message #0
Process #1: Message #0
Process #2: Message #0
Process #3: Message #0
Process #4: Message #0
Process #0: Message #1
Process #1: Message #1
Process #2: Message #1
Process #3: Message #1
Process #4: Message #1

Here we can see that only 4 SIGCHLD signals were received and one was lost.
I’m using Ruby 2.3.8 on Ubuntu 19.04. Why are not all SIGCHLD signals caught?

I don’t want to use waitall() or waitpid(pid_of_child) because I do not want to block the main process.

I was able to reproduce your results on a quad-core processor. It looks like a race condition in your trap. I managed to get it to work correctly by changing it to:

trap('SIGCHLD') do
  pid = Process.wait(-1, Process::WUNTRACED)
  puts "Child #{pid} ended"
end

I also changed your delay to sleep rand(1..5) in the children just to add some variety, and it seems to work OK. Although that’s the problem with race conditions: you can never really prove that it’s fixed, only that it isn’t occurring any more…

Also, the final read loop is endless, although it doesn’t actually loop forever because it eventually blocks forever on its blocking call once the child processes go away. It might be an idea to add a timeout to that call.
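To illustrate the timeout idea, here is a minimal sketch, assuming the blocking call in question is IO.select, which accepts an optional timeout in seconds as its fourth argument and returns nil when that timeout expires. The helper name drain_pipes and the timeout values are hypothetical and not from the original script.

```ruby
# Hypothetical helper: collect whatever can be read from the given pipes and
# give up once no pipe becomes readable within `timeout` seconds.
def drain_pipes(readers, timeout: 0.5)
  chunks = []
  open_readers = readers.dup
  until open_readers.empty?
    # IO.select returns nil when the timeout expires with nothing readable.
    ready, = IO.select(open_readers, nil, nil, timeout)
    break if ready.nil?

    ready.each do |io|
      begin
        chunks << io.read_nonblock(100)
      rescue IO::WaitReadable
        # Reported readable but nothing to read after all; try again later.
      rescue EOFError
        # Every write end is closed: stop watching this pipe.
        open_readers.delete(io)
      end
    end
  end
  chunks
end

rout, wout = IO.pipe
wout.write("hello")
# The write end stays open, so EOF never arrives; only the timeout ends the loop.
result = drain_pipes([rout], timeout: 0.2)
puts result.inspect
```

With a timeout in place the loop can decide for itself when to stop, for example after all children have been reaped, instead of waiting on pipes that will never become readable again.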


Thank you for your reply. Your answer helped me to find the solution. All the signals are sent to the main process, but trap does not catch all of them because they are not pushed into a “queue”: a new SIGCHLD that arrives while one is still pending is merged with it. This means that when SIGCHLD is received, we should call waitpid repeatedly to check for all dead child processes. The SIGCHLD handler should be:

trap('SIGCHLD') do
  # Keep reaping until no exited child is left (nil) or wait raises
  while pid = Process.wait(-1, Process::WNOHANG) rescue nil
    puts pid
  end
end

Nice solution. Makes sense too, if another signal arrives while the trap is already executing, how else could it work?
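For anyone who wants to try the idea in isolation, here is a small self-contained sketch of the reaping loop. It is not the original script: the helper name reap_exited_children, the 5-second deadline, and the use of exit! in the children are assumptions for illustration. It rescues Errno::ECHILD explicitly rather than using rescue nil.

```ruby
# Hypothetical helper: reap every child that has already exited, without blocking.
def reap_exited_children
  reaped = []
  loop do
    begin
      pid = Process.wait(-1, Process::WNOHANG)
    rescue Errno::ECHILD
      break # there are no child processes left at all
    end
    break if pid.nil? # children exist, but none has exited yet
    reaped << pid
  end
  reaped
end

reaped_pids = []
# One SIGCHLD delivery may stand for several exits, so drain them all each time.
trap('SIGCHLD') { reaped_pids.concat(reap_exited_children) }

# exit! skips at_exit hooks in the children; they do no other work here.
child_pids = Array.new(5) { fork { exit!(0) } }

# Poll with a deadline until the handler has reaped every child.
deadline = Time.now + 5
sleep 0.05 until reaped_pids.size == child_pids.size || Time.now > deadline

trap('SIGCHLD', 'DEFAULT')
puts "reaped #{reaped_pids.size} of #{child_pids.size} children"
```

Because the handler loops until Process.wait returns nil, it does not matter how many exits a single signal delivery represents.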