Pipes and flush

stephanwehner · May 25, 2007, 7:28pm

So I wrote a little Ruby script using pipes roughly like this

-- BEGIN SCRIPT ----

#!/usr/bin/ruby -n

def process(line)
  # Open a pipe to the child's stdin; the child inherits our STDOUT.
  f = IO.popen('some_executable', 'w')
  f.puts <<END_INPUT
some executable commands
#{line}
some executable commands
END_INPUT

  f.close

  # These go through Ruby's own STDOUT buffer.
  puts
  puts line

end

process $_ unless $_ =~ /#/

-- END SCRIPT -----

Now this works nicely, I invoke it with

./script input_file > result

“some_executable” prints its results to STDOUT as I expect (given the
lines from input_file), and I can read them in the “result” file.

However, the output of the puts / puts line at the end of
the process method does not occur at the places I expect.

The output of “some_executable” for several “lines” shows up on STDOUT,
then the output of the puts / puts line for several “lines” shows up on
STDOUT, whereas they should simply alternate (depending on the “line”
value).

Am I missing something about pipes, and how the input/output streams are
hooked up?

I am guessing that calling STDOUT.flush before and after the puts/puts
line statements will repair this, but I’m wondering why this would be
necessary.

This is with ruby 1.8.4 (2005-12-24) [i386-linux]

Thanks

Stephan

stephanwehner · May 25, 2007, 9:08pm

Stephan W. wrote:

However, the output of the puts / puts line at the end of
the process method does not occur at the places I expect.

The output of “some_executable” for several “lines” shows up on STDOUT,
then the output of the puts / puts line for several “lines” shows up on
STDOUT, whereas they should simply alternate (depending on the “line”
value).

Am I missing something about pipes, and how the input/output streams are
hooked up?

This has nothing to do with pipes. Ruby is buffering output streams:
the executed command necessarily flushes its output when it terminates
but nothing in Ruby guarantees that the buffer in the (Ruby) stream will
be flushed before the start of the next command.

I’ll try to make it clearer: the buffering is done within Ruby, so
the underlying operating system knows nothing about the two lines you
want to output, and happily just writes what the next executed command
writes.
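
To illustrate, here is a minimal sketch (not part of the original script) that uses a temporary file in place of STDOUT. The helper name sizes_before_and_after_flush is made up for this example. It shows that bytes written through a Ruby IO object can sit in Ruby's buffer, invisible to the operating system, until flush is called:

```ruby
require 'tempfile'

# Hypothetical helper: returns the file size as the OS sees it
# before and after flushing Ruby's own write buffer.
def sizes_before_and_after_flush
  Tempfile.create('buffer_demo') do |f|
    f.sync = false                # make sure Ruby-level buffering is on
    f.write("hello\n")            # stored in Ruby's buffer, not yet written out
    before = File.size(f.path)    # what the OS has received so far
    f.flush                       # hand the buffered bytes to the OS
    after = File.size(f.path)
    [before, after]
  end
end

before, after = sizes_before_and_after_flush
puts "before flush: #{before} bytes, after flush: #{after} bytes"
```

Any child process that writes to the same destination in the meantime gets its output in first, which matches the ordering described in the question.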

Hope this helps,

Vincent

stephanwehner · May 25, 2007, 11:23pm

Vince H&K wrote:

This has nothing to do with pipes. Ruby is buffering output streams:
the executed command necessarily flushes its output when it terminates
but nothing in Ruby guarantees that the buffer in the (Ruby) stream will
be flushed before the start of the next command.

I’ll try to make it clearer: the buffering is done within Ruby, so
the underlying operating system knows nothing about the two lines you
want to output, and happily just writes what the next executed command
writes.

Hope this helps,

You’re saying the output from “some_executable” is passed directly on to
the operating system’s STDOUT, while what puts prints on stdout is
buffered by Ruby?

That would explain what is happening. (But why would Ruby introduce
additional buffering?)

Stephan

stephanwehner · May 26, 2007, 3:46pm

On Sat, May 26, 2007 at 06:23:55AM +0900, Stephan W. wrote:

You’re saying the output from “some_executable” is passed directly on to
the operating system’s STDOUT, while what puts prints on stdout is
buffered by Ruby?

If Ruby goes via the stdio library (fprintf and friends) then it will be
buffered by libc. If not (i.e. it uses write and friends) then it would make
sense for Ruby to add its own layer of buffering. In both cases this is so
that it’s not horrendously inefficient when writing, say, lots of single
characters in a loop.

To defeat this, look at IO#sync=, for example
STDOUT.sync = true

Or call flush before you fork and exec any external programs which write to
the stdout they share with their parent.
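
As a rough sketch of the second option, the process method from the original script could flush before starting the child. The command and out keyword parameters are additions for illustration only, and 'cat' merely stands in for some_executable (this assumes a Unix-like system where cat is available):

```ruby
# Sketch only: 'cat' is a stand-in for the thread's some_executable, and the
# command:/out: parameters are hypothetical additions, not the original script.
def process(line, command: 'cat', out: STDOUT)
  out.flush                                  # push pending Ruby output to the OS first
  IO.popen(command, 'w', out: out) do |f|    # child writes straight to the same stream
    f.puts line
  end                                        # block form closes the pipe and waits for the child
  out.puts
  out.puts line
  out.flush                                  # keep our lines ahead of the next child's output
end
```

Setting STDOUT.sync = true once at the top of the script should have a similar effect without the explicit flush calls, at the cost of one write per puts.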

Regards,

Brian.