Long-standing bug in logger or monitor?

matthewmurcia · April 5, 2010, 12:44pm

Hi there,

I’ve spotted some bad behaviour in a program that I’ve boiled down to a
test case, and I think it demonstrates a bug in either logger or monitor
or my understanding of how either should work.

It runs 1000 threads, all logging through a single Logger at short
random intervals:

SNIP

#!/usr/bin/ruby

demonstrate bug in logger / monitor

require ‘logger’

threads = []
logger = Logger.new(STDOUT)

1000.times do
threads << Thread.new do
logger.info “Blah blah blah”
sleep rand(100.0)/1000.0
end
end

will print 1000 after a few seconds, no more log lines

loop do
p threads.inject(0) { |s,t| t.stop? ? s+1 : s }
sleep 2
end

SNIP

For me, all the logging threads stop after a few seconds of running, the
script simply printing “1000” over and over, which seems like an
unintended deadlock.

The logger implementation appears to try to synchronize threads, but
it doesn’t seem to work. Debugging my own threading problems is bad
enough, but before I try to debug anyone else’s I thought I’d share and
see if anyone had any quicker insights into this issue. For me, it is
happening every few hours in a program only running six threads.

This happens reliably under the versions in Debian/lenny, Ubuntu/karmic
and a stable 1.9.1 built just now on my laptop, specifically:

ruby 1.8.7 (2008-08-11 patchlevel 72) [x86_64-linux]
ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
ruby 1.9.1p376 (2009-12-07 revision 26041) [i686-linux]

matthewmurcia · April 5, 2010, 4:15pm

On Apr 5, 6:43 am, Matthew B. [email protected] wrote:

#!/usr/bin/ruby
sleep rand(100.0)/1000.0
For me, all the logging threads stop after a few seconds of running, the
and a stable 1.9.1 built just now on my laptop, specifically:

ruby 1.8.7 (2008-08-11 patchlevel 72) [x86_64-linux]
ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
ruby 1.9.1p376 (2009-12-07 revision 26041) [i686-linux]

–
Matthew

What were you expecting your code to do? As is, the problem that
snippet is supposed to demonstrate is unclear (at least to me): It
creates 1000 threads which log, then sleep; afterwards it continually
counts the number of threads you created that are sleeping or dead. So
I’d expect this to first print some number very close to (or exactly)
1000, then continually print 1000 after the logging has ended. What am
I missing?

matthewmurcia · April 5, 2010, 8:45pm

facepalm sorry. I thought I had enclosed my logging threads in a loop
{}, i.e. trying to log forever, and a deadlock was what I was hoping to
diagnose in a test case. I definitely see a long hang in my app when
two threads try to log at the same time; will post more details later
and try harder on my test case.

cheers,