Resolving pipe deadlock

This is a cross post from SO:

I am trying to emulate UNIX command line pipes in a Ruby-only solution
that uses multiple cores. Eventually, the records piped from command to
command will be Ruby objects marshaled using msgpack. Unfortunately, the
code hangs after the first dump command. I am really trying to figure
out what causes this deadlock and how to resolve it.

Any hints appreciated.

Cheers,

Martin

If you feed it more than a couple of hundred lines of data the deadlock
happens.

Martin

On 01/17/2014 12:19 AM, Martin H. wrote:

This is a cross post from SO:

I am trying to emulate UNIX command line pipes in a Ruby-only solution
that uses multiple cores. Eventually, the records piped from command to
command will be Ruby objects marshaled using msgpack. Unfortunately, the
code hangs after the first dump command. I am really trying to figure
out what causes this deadlock and how to resolve it.

This seems to work, without the parallel gem (sorry, not familiar with
it):

#!/usr/bin/env ruby

require 'msgpack'
require 'pp'

class Pipe
  def initialize
    @commands = []
  end

  def add(command, options = {})
    @commands << Command.new(command, options)

    self
  end

  def run
    writers = {}

    # Connect each pair of neighbouring commands with an IO.pipe wrapped
    # in msgpack packer/unpacker objects.
    @commands.each_cons(2) do |c_in, c_out|
      reader, writer = IO.pipe

      c_out.input   = MessagePack::Unpacker.new(reader)
      c_in.output   = MessagePack::Packer.new(writer)
      writers[c_in] = writer
    end

    # Fork one child per command; the parent closes its copy of each
    # writer so the downstream readers eventually see EOF.
    @commands.map do |command|
      fork do
        command.run
      end

      writers[command].close if writers[command]
    end

    Process.waitall
  end

  class Command
    attr_accessor :input, :output

    def initialize(command, options)
      @command = command
      @options = options
      @input   = nil
      @output  = nil
    end

    def run
      send @command
    end

    def cat
      @input.each { |record| @output.write(record).flush } if @input

      File.open(@options[:input]) do |ios|
        ios.each { |record| @output.write(record).flush } if @output
      end
    end

    def dump
      @input.each do |record|
        puts record
        @output.write(record).flush if @output
      end
    end
  end
end

p = Pipe.new
p.add(:cat, input: "foo.tab").add(:dump).add(:cat, input: "table.txt").add(:dump)
p.run

Thanks Robert,

Your explanation fits more or less exactly what I was fearing.

I think it would be really nice from an understanding/debugging point
of view if I could get this working in a single thread (parallel can be
disabled by setting in_processes: 0). However, there are several
issues.

First: with a simple pipe, p.add(:cat, input: "foo.tab").add(:dump),
where "foo.tab" only contains a few lines, the script hangs after
:dump, and I suspect it is because of a missing "termination signal"
like EOF (see the sketch below).

Second, if "foo.tab" contains more than a couple of thousand lines then
it blocks in the :cat step - and I suspect that the reader buffer is
full and requires draining to resolve the block.

Third, I was hoping that with multiple processes the next process would
drain the buffer from the preceding step in a timely fashion. However,
it is clear that this isn't the case - some synchronization is required
between the processes. If only Ruby (MRI) threads supported multiple
processors it could be handled with a mutex (darn GVL!). One could dig
into EventMachine to see if that would work, but I am scared by the
complexity of it.

Thus, I fear my design is flawed. One thing I can think of is to change
the design slightly so a single record at a time is passed from command
to command.
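
A minimal sketch of that first point (not code from the thread, just an
illustration with the same msgpack streaming API): the reading side's
Unpacker#each only returns once every copy of the write end has been
closed, because msgpack has no end-of-stream marker of its own; EOF on
the pipe is the termination signal.

require 'msgpack'

reader, writer = IO.pipe
packer   = MessagePack::Packer.new(writer)
unpacker = MessagePack::Unpacker.new(reader)

%w[foo bar baz].each { |rec| packer.write(rec).flush }

# Without this close the unpacker below waits forever for more records:
writer.close

unpacker.each { |rec| puts rec }   # returns once EOF is reached
reader.close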

Cheers,

Martin

On Sat, Jan 18, 2014 at 10:34 AM, Martin H. [email protected]
wrote:

If you feed it more than a couple of hundred lines of data the deadlock
happens.

No knowledge of the "parallel" gem here either, but this phenomenon
also happens if one does not properly parallelize IO, so it may be an
explanation. If it is what I am suggesting, then everything works as
long as the data size is less than some buffer size, because then a
write can finish even if there is no live reader. If you exceed the
limit, one IO operation blocks waiting for another to read data from
the buffer so it can write more data. If that read is somehow blocked
by the first operation (e.g. via some synchronization mechanism, or
simply because it would be invoked from the same thread afterwards),
then the deadlock happens: the writer cannot finish and the reader
never gets a chance to read.
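
To make this concrete, here is a minimal sketch (hypothetical, not code
from the thread): the blocking write below goes through because a
forked child keeps draining the pipe; with the read scheduled after the
write in a single thread, the same call would hang as soon as the
kernel pipe buffer (commonly 64 KiB on Linux) is full.

reader, writer = IO.pipe
payload = "x" * 200_000            # well above a typical pipe buffer

pid = fork do                      # a live reader in a separate process
  writer.close                     # the child only reads
  total = 0
  while (chunk = reader.read(4096))
    total += chunk.bytesize
  end
  puts "child read #{total} bytes"
end

reader.close
writer.write(payload)              # fine: the child drains the buffer; in a
                                   # single thread this call would block
                                   # forever once the buffer is full
writer.close
Process.wait(pid)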

Kind regards

robert

I get this with strace, but I don’t really know what it means:

maasha@mao:~/scratch$ strace -cp 18264
Process 18264 attached - interrupt to quit
Process 18264 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  -nan    0.000000           0         6           read
  -nan    0.000000           0        23         1 write
  -nan    0.000000           0         1           open
  -nan    0.000000           0         4           close
  -nan    0.000000           0         1           fstat
  -nan    0.000000           0         1           mmap
  -nan    0.000000           0         2           munmap
  -nan    0.000000           0         3           rt_sigaction
  -nan    0.000000           0         2         1 rt_sigreturn
  -nan    0.000000           0         1           getrlimit
  -nan    0.000000           0         1           sched_getaffinity
  -nan    0.000000           0         1           tgkill
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                    46         2 total

Martin H. [email protected] wrote:

I am trying to emulate UNIX command line pipes in a Ruby-only solution
that uses multiple cores. Eventually, the records piped from command to
command will be Ruby objects marshaled using msgpack. Unfortunately, the
code hangs after the first dump command. I am really trying to figure
out what causes this deadlock and how to resolve it.

Hey, I'm not familiar with either parallel or msgpack, but can you try
using strace (or a similar syscall tracer for your OS) on the spawned
processes? You'll probably find one (or more) of the processes stuck in
the read/write/select syscalls, and see which FDs they're stuck on.

If you’re on Linux, perhaps check out dtas-graph, a Perl script I wrote
for visualizing the pipelines and dataflows within dtas-player[1]. I
combined it with strace for debugging, because it was sometimes
confusing to associate the FDs shown in strace output with the actual
pipes/connected process on the other end(s).

dtas-graph should work on any Linux processes using pipes, not just
dtas-player.

git clone git://80x24.org/dtas dtas

You'll need the Graph::Easy perl module ("apt-get install
libgraph-easy-perl" if you're on Debian). I would've written it in
Ruby, but I couldn't find a graphing library in Ruby with ASCII/UTF-8
text output.

perl dtas/perl/dtas-graph $PID_OF_MAIN_PROCESS

Sample output: http://dtas.80x24.org/dtas-graph-sample.txt

Arrows denote the direction of data flow; the numbers on the lines are
the file descriptors associated with the pipe for each process. Boxes
represent either processes (identified via PIDs) or pipe objects
(identified via a PIPEID:PIPE_INO mapping).

[1] - duct tape audio suite - http://dtas.80x24.org/

Martin H. [email protected] wrote:

Right (cryptic output IMHO):

You’ll have to figure that all out and what it means.

I'm just pointing you towards some of the tools available. Leave the
processes running while you strace; run strace on each process in
multiple terminals, look at "ps axf" output, etc.

Martin H. [email protected] wrote:

I get this with strace, but I don’t really know what it means:

maasha@mao:~/scratch$ strace -cp 18264

Just use strace -p, no need for -c unless you’re doing performance
stuff. Try -f to follow forks, too.

Right (cryptic output IMHO):

maasha@mao:~/scratch$ strace -p 18272
Process 18272 attached - interrupt to quit
write(8, “\2764571\t4605\t5_GQbZ3ywXsN1/1\t7\t-\n”, 31) = ? ERESTARTSYS
(To be restarted)
— SIGINT (Interrupt) @ 0 (0) —
write(4, “!”, 1) = 1
— SIGVTALRM (Virtual timer expired) @ 0 (0) —
rt_sigreturn(0x1a) = 1
rt_sigreturn(0x2) = -1 EINTR (Interrupted system
call)
close(9) = 0
open("/proc/self/maps", O_RDONLY) = 9
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) =
0
fstat(9, {st_mode=S_IFREG|0444, st_size=0, …}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x7fd49ea35000
read(9, “7fd49d0d0000-7fd49d0df000 r-xp 0”…, 1024) = 1024
read(9, " /usr/local/lib/ruby/2"…, 1024) = 1024
read(9, "0000 08:01 260629 "…, 1024) = 1024
read(9, “400000 rw-p 00003000 08:01 26062”…, 1024) = 1024
read(9, “:00 0 \n7fd49e824000-7fd49e844000”…, 1024) = 1024
read(9, "fff64280000 rw-p 00000000 00:00 "…, 1024) = 231
close(9) = 0
munmap(0x7fd49ea35000, 4096) = 0
sched_getaffinity(18272, 32, {ffff, 0, 0, 0}) = 32
write(2, "./pipes.rb:50:in `write'", 24) = 24
write(2, ": ", 2) = 2
write(2, "Interrupt", 9) = 9
write(2, "\n", 1) = 1
write(2, "\tfrom ./pipes.rb:50:in `flush'\n", 31) = 31
write(2, "\tfrom ./pipes.rb:50:in `block (2"..., 49) = 49
write(2, "\tfrom ./pipes.rb:50:in `each'\n", 30) = 30
write(2, "\tfrom ./pipes.rb:50:in `block in"..., 38) = 38
write(2, "\tfrom ./pipes.rb:49:in `open'\n", 30) = 30
write(2, "\tfrom ./pipes.rb:49:in `cat'\n", 29) = 29
write(2, "\tfrom ./pipes.rb:43:in `run'\n", 29) = 29
write(2, "\tfrom ./pipes.rb:27:in `block in"..., 38) = 38
write(2, "\tfrom /usr/local/lib/ruby/gems/2"..., 103) = 103
write(2, "\tfrom /usr/local/lib/ruby/gems/2"..., 87) = 87
write(2, "\tfrom /usr/local/lib/ruby/gems/2"..., 98) = 98
write(2, "\tfrom /usr/local/lib/ruby/gems/2"..., 94) = 94
write(2, "\tfrom /usr/local/lib/ruby/gems/2"..., 86) = 86
write(2, "\tfrom /usr/local/lib/ruby/gems/2"..., 86) = 86
write(2, "\tfrom ./pipes.rb:27:in `run'\n", 29) = 29
write(2, "\tfrom ./pipes.rb:67:in `<main>'\n", 32) = 32
rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER|SA_SIGINFO,
0x7fd49e617030}, {0x7fd49eb2d6f0, [], SA_RESTORER|SA_SIGINFO,
0x7fd49e617030}, 8) = 0
rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER|SA_SIGINFO,
0x7fd49e617030}, {SIG_IGN, [], SA_RESTORER|SA_SIGINFO, 0x7fd49e617030},
8) = 0
close(7) = 0
close(8) = 0
write(4, “!”, 1) = 1
munmap(0x7fd49e926000, 1052672) = 0
rt_sigaction(SIGINT, {SIG_DFL, [INT], SA_RESTORER|SA_RESTART,
0x7fd49d9eb4f0}, {SIG_DFL, [], SA_RESTORER|SA_SIGINFO, 0x7fd49e617030},
8) = 0
tgkill(18272, 18272, SIGINT) = 0
— SIGINT (Interrupt) @ 0 (0) —
Process 18272 detached

maasha@mao:~/scratch$ strace -fp 18277
Process 18277 attached with 2 threads - interrupt to quit
[pid 18278] restart_syscall(<… resuming interrupted call …>
<unfinished …>
[pid 18277] write(8, “\2764571\t4605\t5_GQbZ3ywXsN1/1\t7\t-\n”, 31) = ?
ERESTARTSYS (To be restarted)
[pid 18277] — SIGINT (Interrupt) @ 0 (0) —
[pid 18277] write(4, “!”, 1 <unfinished …>
[pid 18278] <… restart_syscall resumed> ) = 1
[pid 18277] <… write resumed> ) = 1
[pid 18278] read(3, <unfinished …>
[pid 18277] rt_sigreturn(0x2 <unfinished …>
[pid 18278] <… read resumed> “!”, 1024) = 1
[pid 18277] <… rt_sigreturn resumed> ) = -1 EINTR (Interrupted system
call)
[pid 18278] read(3, <unfinished …>
[pid 18277] write(8, “\2764571\t4605\t5_GQbZ3ywXsN1/1\t7\t-\n”, 31
<unfinished …>
[pid 18278] <… read resumed> 0x7f2fc6d28020, 1024) = -1 EAGAIN
(Resource temporarily unavailable)
[pid 18278] read(5, 0x7f2fc6d28020, 1024) = -1 EAGAIN (Resource
temporarily unavailable)
[pid 18278] tgkill(18277, 18277, SIGVTALRM) = 0
[pid 18278] poll([{fd=3, events=POLLIN}], 1, 100 <unfinished …>
[pid 18277] <… write resumed> ) = ? ERESTARTSYS (To be
restarted)
[pid 18277] — SIGVTALRM (Virtual timer expired) @ 0 (0) —
[pid 18277] rt_sigreturn(0x1a) = -1 EINTR (Interrupted system
call)
[pid 18277] close(9) = 0
[pid 18277] open("/proc/self/maps", O_RDONLY) = 9
[pid 18277] getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024,
rlim_max=RLIM_INFINITY}) = 0
[pid 18277] fstat(9, {st_mode=S_IFREG|0444, st_size=0, …}) = 0
[pid 18277] mmap(NULL, 4096, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2fc687f000
[pid 18277] read(9, “7f2fc4f1a000-7f2fc4f29000 r-xp 0”…, 1024) = 1024
[pid 18277] read(9, " /usr/local/lib/ruby/2"…, 1024) = 1024
[pid 18277] read(9, "0000 08:01 260629 "…, 1024) = 1024
[pid 18277] read(9, “24a000 rw-p 00003000 08:01 26062”…, 1024) = 1024
[pid 18277] read(9, “:00 0 \n7f2fc666e000-7f2fc668e000”…, 1024) = 1024
[pid 18277] read(9, "fff1d5fb000 rw-p 00000000 00:00 "…, 1024) = 231
[pid 18277] close(9) = 0
[pid 18277] munmap(0x7f2fc687f000, 4096) = 0
[pid 18277] sched_getaffinity(18277, 32, {ffff, 0, 0, 0}) = 32
[pid 18277] write(2, "./pipes.rb:50:in `write'", 24) = 24
[pid 18277] write(2, ": ", 2) = 2
[pid 18277] write(2, "Interrupt", 9) = 9
[pid 18277] write(2, "\n", 1) = 1
[pid 18277] write(2, "\tfrom ./pipes.rb:50:in `flush'\n", 31) = 31
[pid 18277] write(2, "\tfrom ./pipes.rb:50:in `block (2"..., 49) = 49
[pid 18277] write(2, "\tfrom ./pipes.rb:50:in `each'\n", 30) = 30
[pid 18277] write(2, "\tfrom ./pipes.rb:50:in `block in"..., 38) = 38
[pid 18277] write(2, "\tfrom ./pipes.rb:49:in `open'\n", 30) = 30
[pid 18277] write(2, "\tfrom ./pipes.rb:49:in `cat'\n", 29) = 29
[pid 18277] write(2, "\tfrom ./pipes.rb:43:in `run'\n", 29) = 29
[pid 18277] write(2, "\tfrom ./pipes.rb:27:in `block in"..., 38) = 38
[pid 18277] write(2, "\tfrom /usr/local/lib/ruby/gems/2"..., 103) = 103
[pid 18277] write(2, "\tfrom /usr/local/lib/ruby/gems/2"..., 87) = 87
[pid 18277] write(2, "\tfrom /usr/local/lib/ruby/gems/2"..., 98) = 98
[pid 18277] write(2, "\tfrom /usr/local/lib/ruby/gems/2"..., 94) = 94
[pid 18277] write(2, "\tfrom /usr/local/lib/ruby/gems/2"..., 86) = 86
[pid 18277] write(2, "\tfrom /usr/local/lib/ruby/gems/2"..., 86) = 86
[pid 18277] write(2, "\tfrom ./pipes.rb:27:in `run'\n", 29) = 29
[pid 18277] write(2, "\tfrom ./pipes.rb:67:in `<main>'\n", 32) = 32
[pid 18277] rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER|SA_SIGINFO,
0x7f2fc6461030}, {0x7f2fc69776f0, [], SA_RESTORER|SA_SIGINFO,
0x7f2fc6461030}, 8) = 0
[pid 18277] rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER|SA_SIGINFO,
0x7f2fc6461030}, {SIG_IGN, [], SA_RESTORER|SA_SIGINFO, 0x7f2fc6461030},
8) = 0
[pid 18277] close(7) = 0
[pid 18277] close(8) = 0
[pid 18277] write(4, “!”, 1 <unfinished …>
[pid 18278] <… poll resumed> ) = 1 ([{fd=3, revents=POLLIN}])
[pid 18277] <… write resumed> ) = 1
[pid 18278] read(3, <unfinished …>
[pid 18277] futex(0x7f2fc688a9d0, FUTEX_WAIT, 18278, NULL <unfinished
…>
[pid 18278] <… read resumed> “!”, 1024) = 1
[pid 18278] read(3, 0x7f2fc6d28020, 1024) = -1 EAGAIN (Resource
temporarily unavailable)
[pid 18278] read(5, 0x7f2fc6d28020, 1024) = -1 EAGAIN (Resource
temporarily unavailable)
[pid 18278] _exit(0) = ?
Process 18278 detached
<… futex resumed> ) = 0
munmap(0x7f2fc6770000, 1052672) = 0
rt_sigaction(SIGINT, {SIG_DFL, [INT], SA_RESTORER|SA_RESTART,
0x7f2fc58354f0}, {SIG_DFL, [], SA_RESTORER|SA_SIGINFO, 0x7f2fc6461030},
8) = 0
tgkill(18277, 18277, SIGINT) = 0
— SIGINT (Interrupt) @ 0 (0) —
Process 18277 detached

OK, the output of strace -o trace -ff -ttt ./pipes.rb produces two
files:

ftp://ftp_20140119_17561:[email protected]

I have yet to figure out what they mean.

Cheers,

Martin

On Sun, Jan 19, 2014 at 11:06 AM, Martin H. [email protected]
wrote:

Right (cryptic output IMHO):

maasha@mao:~/scratch$ strace -p 18272
Process 18272 attached - interrupt to quit
write(8, “\2764571\t4605\t5_GQbZ3ywXsN1/1\t7\t-\n”, 31) = ? ERESTARTSYS
(To be restarted)

A lot is missing before that. To find out what object a file
descriptor points to it’s usually better to start the process with
“strace -f” and not attach later. Because if you attach later then
the open() calls are usually not contained. I would try something
like

$ strace -o trace -ff -ttt your/ruby/process with args

This will write one trace file per process called "trace.<PID>" and
follow forks (-o trace and -ff) and print timestamps with microsecond
resolution (-ttt). I suggest the latter over -r because then you can
align events from different processes if need be.

It seems the trace mostly contains reaction to you killing the process
with SIGINT (similar for the other trace):

fstat(9, {st_mode=S_IFREG|0444, st_size=0, …}) = 0
sched_getaffinity(18272, 32, {ffff, 0, 0, 0}) = 32
write(2, “\tfrom ./pipes.rb:43:in `run’\n”, 29) = 29
0x7fd49e617030}, {0x7fd49eb2d6f0, [], SA_RESTORER|SA_SIGINFO,
8) = 0
tgkill(18272, 18272, SIGINT) = 0
— SIGINT (Interrupt) @ 0 (0) —
Process 18272 detached

[pid 18278] read(5, 0x7f2fc6d28020, 1024) = -1 EAGAIN (Resource
temporarily unavailable)

Maybe lines above give an indication that the reader is actually
blocked because there is no data. (Ruby MRI internally uses
nonblocking IO to multiplex threads.)
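
Those EAGAIN reads are what a non-blocking read on an empty pipe looks
like from Ruby; a tiny illustration (not taken from the trace):

reader, writer = IO.pipe

begin
  reader.read_nonblock(1024)       # pipe is empty, write end still open
rescue IO::WaitReadable            # Errno::EAGAIN extended with IO::WaitReadable
  puts "EAGAIN: nothing buffered; a blocking read would sleep here"
end

writer.close
reader.close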

Kind regards

robert

On Sun, Jan 19, 2014 at 1:55 PM, Martin H. [email protected]
wrote:

OK, the output of strace -o trace -ff -ttt ./pipes.rb produces two
files:

ftp://ftp_20140119_17561:[email protected]

Thank you!

I have yet to figure out what they mean.

I can share a few observations:

The parent creates three pipes:
$ fgrep pipe2 trace.1*
trace.18386:1390135219.046131 pipe2([3, 4], O_CLOEXEC) = 0
trace.18386:1390135219.046295 pipe2([5, 6], O_CLOEXEC) = 0
trace.18386:1390135219.112926 pipe2([7, 8], O_CLOEXEC) = 0

The client reads only from FD 3 and 5:

$ fgrep read trace.18387
1390135375.259220 read(3, “!”, 1024) = 1
1390135375.259410 read(3, 0x7fb63c823020, 1024) = -1 EAGAIN (Resource
temporarily unavailable)
1390135375.259663 read(5, 0x7fb63c823020, 1024) = -1 EAGAIN (Resource
temporarily unavailable)
1390135375.261880 read(3, “!”, 1024) = 1
1390135375.261910 read(3, 0x7fb63c823020, 1024) = -1 EAGAIN (Resource
temporarily unavailable)
1390135375.261932 read(5, 0x7fb63c823020, 1024) = -1 EAGAIN (Resource
temporarily unavailable)

The parent writes to FD 2 (stderr), 4 and 8:
$ egrep -o 'write\([0-9]+' trace.18386 | sort -u
write(2
write(4
write(8

But to 4 only at the end
$ fgrep 'write(4' trace.18386
1390135375.258855 write(4, “!”, 1) = 1
1390135375.261804 write(4, “!”, 1) = 1

Main output from file "big.tab" goes to 8:

$ fgrep -c 'write(8' trace.18386
1924

Now, as you can see 8 is connected to 7 but since the child never
reads from 7 no data is transferred to it.

If you look at the ordering of pipe2 and clone you see that the client
is created before the last pipe is opened:

$ egrep '^[0-9]+\.[0-9]+ (pipe|clone)' trace.18386
1390135219.046131 pipe2([3, 4], O_CLOEXEC) = 0
1390135219.046295 pipe2([5, 6], O_CLOEXEC) = 0
1390135219.046513 clone(child_stack=0x7fb63c384ff0,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7fb63c3859d0, tls=0x7fb63c385700,
child_tidptr=0x7fb63c3859d0) = 18387
1390135219.112926 pipe2([7, 8], O_CLOEXEC) = 0

So maybe it cannot read from FD 7 which it would need to do to read
what the parent writes to FD 8. But see at the end.

The strange thing is that almost all write operations are successful
because as many bytes are written as indicated:

$ egrep -c 'write\(8,.* ([0-9]+)\) = \1' trace.18386
1922
$ egrep -c 'write\(8,.* ([0-9]+)\) =' trace.18386
1924

$ diff -U5 <(egrep 'write\(8,.* ([0-9]+)\) =' trace.18386) <(egrep 'write\(8,.* ([0-9]+)\) = \1' trace.18386)
--- /dev/fd/63  2014-01-19 14:33:25.426820557 +0100
+++ /dev/fd/62  2014-01-19 14:33:25.426820557 +0100
@@ -1918,7 +1918,5 @@
 1390135219.160905 write(8, "\332\0 4559\t4602\t3_cpVRyxwXsN1/2\t592"..., 35) = 35
 1390135219.160929 write(8, "\332\0 4563\t4597\t5_X4035ywXsN1/1\t833"..., 35) = 35
 1390135219.160954 write(8, "\332\0 4565\t4599\t1_bWkzlxwXsN1/1\t442"..., 35) = 35
 1390135219.160980 write(8, "\332\0 4568\t4611\t1_TQZWsxwXsN1/2\t540"..., 35) = 35
 1390135219.161004 write(8, "\332\0 4570\t4613\t3_gQoPtxwXsN1/2\t196"..., 35) = 35
-1390135219.161029 write(8, "\2764571\t4605\t5_GQbZ3ywXsN1/1\t7\t-\n", 31) = ? ERESTARTSYS (To be restarted)
-1390135375.259425 write(8, "\2764571\t4605\t5_GQbZ3ywXsN1/1\t7\t-\n", 31) = ? ERESTARTSYS (To be restarted)

The last two happen because of the signal, I believe:

1390135219.161029 write(8, "\2764571\t4605\t5_GQbZ3ywXsN1/1\t7\t-\n", 31) = ? ERESTARTSYS (To be restarted)
1390135375.258745 --- SIGINT (Interrupt) @ 0 (0) ---
1390135375.258855 write(4, "!", 1) = 1
1390135375.259234 rt_sigreturn(0x2) = -1 EINTR (Interrupted system call)
1390135375.259425 write(8, "\2764571\t4605\t5_GQbZ3ywXsN1/1\t7\t-\n", 31) = ? ERESTARTSYS (To be restarted)
1390135375.259784 --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
1390135375.259836 rt_sigreturn(0x1a) = -1 EINTR (Interrupted system call)

All in all 65266 bytes are written:

$ awk '/write\(8,.* = [0-9]+/ {sum+=$NF} END{print sum}' trace.18386
65266

This is 270 bytes short of 64k. I have no idea whether you happened to
kill the parent before it got into the stuck state. You could check
your pipe buffer size with "ulimit -p". On Linux this gives the buffer
size in 512-byte blocks. On my machine it returns 8, which is 4k, which
also happens to be the memory page size, but of course it may be
different on your system.
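
For what it is worth, the capacity can also be measured directly from
Ruby; a small sketch (not from the thread) that fills a pipe nobody
reads from until the next write would block:

reader, writer = IO.pipe
bytes = 0
chunk = "x" * 1024

begin
  loop { bytes += writer.write_nonblock(chunk) }
rescue IO::WaitWritable            # the next write would block: buffer is full
  puts "pipe buffer holds #{bytes} bytes"
ensure
  writer.close
  reader.close
end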

Maybe you can repeat the test and wait a bit longer to see whether the
parent gets into some kind of blocking state.

I looked a bit closer at the flags passed to clone() and it seems the
child is only a thread:

1390135219.046513 clone(child_stack=0x7fb63c384ff0,
flags=CLONE_VM (same memory space)
|CLONE_FS (same filesystem information)
|CLONE_FILES (share the same file descriptor table.)
|CLONE_SIGHAND (share the same table of signal handlers)
|CLONE_THREAD (same thread group as the calling process)
|CLONE_SYSVSEM (share a single list of System V semaphore undo values)
|CLONE_SETTLS (the newtls argument is the new TLS (Thread Local
Storage) descriptor)
|CLONE_PARENT_SETTID (store child thread ID at location ptid in parent
and child memory)
|CLONE_CHILD_CLEARTID (erase child thread ID at location ctid in
child memory when the child exits)

So, since both share the same FD table, the client could read from FD
7, but apparently nobody does. So you have a pipe which is written to
but never read from. At some point the writer must block, since
buffers are limited - always.

Kind regards

robert

Thanks Robert,

As usual your assistance and insight is awesome.

I did let the strace run for a while before, and tested a couple of new
ones that were run for a minute or so, but since the traces contain the
same number of lines, I don't think run time was an issue.

I don’t understand why 3 pipes (pipe pairs?) are reported, when IO.pipe
is called only once and fork is disabled (also disabling msgpack doesn’t
change the number of pipes).

“ulimit -p” on my linux system yields 8 and that indicates a 64K buffer.
On my Mac it yields 1?

Anyways, I did some more tests/straces with Parallel enabled:

Since this version uses forks it means that a separate process is
reading from the process that is writing (that’s my idea at least). This
works to the degree that all of the data is being dumped, but then the
script hangs - I suspect that the pipe is not closed and freed
correctly. I uploaded the traces from this test to trace_2step.tar to
ftp://ftp_20140119_17561:[email protected]

Moreover, adding a further step in line 67 in the above gist:

p.add(:cat, input: "big.tab").add(:cat, input: "table.txt").add(:dump)

and it hangs after dumping big.tab. Again, I suspect bad pipe handling,
and I uploaded the traces from that one as well, as trace_3step.tar. In
fact I uploaded code, test files and traces.

Cheers,

Martin

On Jan 19, 2014, at 1:17, Eric W. [email protected] wrote:

I would've written it in Ruby but I couldn't find a graphing
library in Ruby with ASCII/UTF-8 text output.

If I'm allowed to use aalib, I can have one for you in about 10 more
lines of code. :P

On Sun, Jan 19, 2014 at 6:47 PM, Martin H. [email protected]
wrote:

Thanks Robert,

As usual your assistance and insight is awesome.

You’re welcome!

I did let the strace run for a while before, and tested a couple of new
ones that were run for a minute or so, but since the traces contain the
same number of lines, I don't think run time was an issue.

I don’t understand why 3 pipes (pipe pairs?) are reported, when IO.pipe
is called only once and fork is disabled (also disabling msgpack doesn’t
change the number of pipes).

“ulimit -p” on my linux system yields 8 and that indicates a 64K buffer.
On my Mac it yields 1?

Does 8 really mean 64k? I would have assumed that all flavors of
Linux use 512 byte blocks which would mean 4k - same as on my box.

Moreover, adding a further step in line 67 in the above gist:

p.add(:cat, input: "big.tab").add(:cat, input: "table.txt").add(:dump)

and it hangs after dumping big.tab. Again, I suspect bad pipe handling,
and I uploaded the traces from that one as well, as trace_3step.tar. In
fact I uploaded code, test files and traces.

I don't have time right now to dive into yet another trace, but I did
look at the documentation of parallel. As far as I can tell from a
brief inspection, it implements a farmer/worker scenario, i.e. you
have a queue of work items and workers (threads or processes) pick
them up one after the other and execute them. But if I understand what
you are doing properly, this is not what you want. Basically you want
all tasks to run in parallel, with the work items taken from somewhere
else (i.e. the task at the head of the FIFO pipeline reads them from a
file) and piped through. In other words: it seems you are not using
the proper framework for your problem at hand. :-)
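
For the pipeline shape described here, a minimal sketch (hypothetical
code, not Martin's script; the stage bodies and file name are made up):
one forked process per stage, an IO.pipe between neighbours, and every
process closing the pipe ends it does not use so that EOF propagates
down the chain.

stages = [
  ->(_in, out)  { File.foreach("foo.tab") { |line| out.puts(line) } },  # producer
  ->(inp, out)  { inp.each_line { |line| out.puts(line.upcase) } },     # filter
  ->(inp, _out) { inp.each_line { |line| print line } }                 # consumer
]

input = nil   # read end handed from one stage to the next

stages.each_with_index do |stage, i|
  reader, writer = (i == stages.size - 1) ? [nil, nil] : IO.pipe

  fork do
    reader.close if reader          # a stage never reads from its own output pipe
    stage.call(input, writer)
    writer.close if writer          # signals EOF to the next stage
  end

  writer.close if writer            # the parent drops its copies so only the
  input.close  if input             # two stages involved keep each pipe open
  input = reader
end

Process.waitall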

Kind regards

robert

On Sun, Jan 19, 2014 at 6:47 PM, Martin H. [email protected]
wrote:

and it hangs after dumping big.tab. Again, I suspect bad pipe handling,
and I uploaded the traces from that one as well, as trace_3step.tar. In
fact I uploaded code, test files and traces.

One more potential source of issues: you open all the pipes from the
main process. However, child processes will inherit those pipe
objects and if they are not closed there they will keep the pipeline
alive. That might be the issue you face.
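
A tiny illustration of that effect (a hypothetical example, not
Martin's code): the reading child below also inherits the write end,
and if it did not close that inherited copy, its read would never see
EOF, even after the writing child and the parent have closed theirs.

reader, writer = IO.pipe

fork do                  # writing child
  reader.close
  writer.write("hello\n")
  writer.close
end

fork do                  # reading child; note it inherited `writer` as well
  writer.close           # leave this out and reader.read below never returns,
                         # because this process still holds an open write end
  puts reader.read       # read to EOF
  reader.close
end

writer.close             # the parent has to drop its copy too
reader.close
Process.waitall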

Cheers

robert

Ryan D. [email protected] wrote:

On Jan 19, 2014, at 1:17, Eric W. [email protected] wrote:

I would've written it in Ruby but I couldn't find a graphing
library in Ruby with ASCII/UTF-8 text output.

If I'm allowed to use aalib, I can have one for you in about 10 more
lines of code. :P

Sure, I’d try it!

@Robert, that is of course something I need to look into. I did notice
in the docs that IO.pipe has a child/parent relationship, but some
clever guy on IRC #ruby said it would probably be OK to use. I was
actually using named pipes in an earlier version - and those have no
child/parent bindings - but the script would still hang for reasons
unknown.

For now I will do some forking manually to avoid Parallel.

Cheers,

Martin

On Mon, Jan 20, 2014 at 10:24 PM, Martin H. [email protected]
wrote:

This Parallel-less version is more or less following the example from
the IO.pipe docs; however, even if I can connect the commands in the
pipeline, I fail to see how I can then run the commands:

I have a solution which connects commands but it still suffers from
your original issue (i.e. not terminating properly).

I better sleep on it :o/

Good plan!

Kind regards

robert