Reading from an external process with IO.popen

cluettr · November 5, 2009, 7:50pm

Hello all,

I’m trying to wrap my head around IO.popen with some simple examples
that send data to and read data from an
external process. I’ve created a sample case in the shell like this:

$ { echo hello ; sleep 2 ; echo world; } | cat
hello
world

I’ve written the same in ruby like so, which works:

$ cat foo.rb
#!/usr/bin/env ruby
if $0 == __FILE__
  cat = IO.popen("cat", "w+")
  cat.puts("hello, ")
  puts(cat.gets)
  sleep 2
  cat.puts("world")
  puts(cat.gets)
end

$ ./foo.rb
hello
world

However, if I change the cat command to a sed command, the ruby
version no longer works. The command-line equivalent does work, but
the ruby version waits forever and has to be interrupted:

$ { echo hello ; sleep 2 ; echo world; } | sed -ne p
hello
world

$ cat foo.rb
#!/usr/bin/env ruby
if $0 == __FILE__
  cat = IO.popen("sed -ne p", "w+")
  cat.puts("hello, ")
  puts(cat.gets)
  sleep 2
  cat.puts("world")
  puts(cat.gets)
end

$ ./foo.rb
./foo.rb:6:in `gets': Interrupt
        from ./foo.rb:6

Why does ruby work in the first case but wait forever in the second?

Using this version of ruby:

$ ruby -v
ruby 1.8.6 (2007-09-24 patchlevel 111) [i486-linux]

Thanks in advance for any pointers to references.

Regards,

Robert

cluettr · November 6, 2009, 11:16am

2009/11/5 Robert C. [email protected]:

> I’ve written the same in ruby like so, which works:
> end
>
> $ ./foo.rb
> hello
> world
>
> However, if I change the cat command to a sed command, the ruby
> version no longer works. The command-line equivalent does work, but
> the ruby version waits forever and has to be interrupted:

That’s probably because you do not close the write end of the pipe in
Ruby code. Also, it’s better to place the reading portion in a
separate thread in order to prevent deadlocks. And, please use the
block form of IO.popen which is more robust.

Try this pattern:

IO.popen("cat", "w+") do |cat|
  # background output
  t = Thread.new { cat.each {|l| puts l} }

  # main work
  cat.puts "hello, "
  sleep 2
  cat.puts "world"

  # terminate processing:
  cat.close_write
  t.join
end
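
The same pattern can be sketched for the sed case. This is only an illustration, assuming a sed that accepts `-ne p` is on the PATH; the helper name echo_through is made up here and does not come from the original posts. It collects the lines instead of printing them, which shows that sed does deliver its output once the write end is closed:

```ruby
# Hypothetical helper: feed lines to a command and collect what it prints.
def echo_through(command, lines)
  received = []
  IO.popen(command, "w+") do |io|
    # Read in a background thread so a full pipe cannot deadlock the writer.
    reader = Thread.new { io.each_line { |l| received << l.chomp } }

    lines.each { |l| io.puts(l) }

    # Closing the write end gives the child EOF, so it flushes and exits.
    io.close_write
    reader.join
  end
  received
end

p echo_through("sed -ne p", %w[hello world])
```

Note that the lines only arrive after close_write, so this does not give an interactive exchange with sed.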

Kind regards

robert

cluettr · November 6, 2009, 4:02pm

On Fri, Nov 6, 2009 at 5:15 AM, Robert K.
[email protected] wrote:

> 2009/11/5 Robert C. [email protected]:
>
>> However, if I change the cat command to a sed command, the ruby
>> version no longer works. The command-line equivalent does work, but
>> the ruby version waits forever and has to be interrupted:
>
> That’s probably because you do not close the write end of the pipe in
> Ruby code.

Perhaps, but what if I don’t want to close the pipe? That is, I would
like to keep the pipe open so that I can send some data, read some
data and work on it, send some more data, read some more data and work
on it, etc. much like the process was a service, e.g. database. I am
trying to code the equivalent of a Call and Response. My examples
using cat and sed are just stand-ins for the real program.

BTW, the cat example works as expected, but the one using sed doesn’t
work. That is, there is no output from sed until the pipe closes.
There seems to be some buffering going on. I’m guessing it’s from the
Ruby side since I don’t see this when run from the shell. But that’s
just a guess.

Of course, it’s entirely possible that IO.popen is not the “right” way
to tackle this and I have not discovered the Ruby way, yet.

Again, any pointers in the right direction are greatly appreciated.

Regards,

Robert

cluettr · November 6, 2009, 6:50pm

On 11/06/2009 04:01 PM, Robert C. wrote:

> like to keep the pipe open so that I can send some data, read some
> data and work on it, send some more data, read some more data and work
> on it, etc. much like the process was a service, e.g. database. I am
> trying to code the equivalent of a Call and Response. My examples
> using cat and sed are just stand-ins for the real program.

If the program you are using does not cooperate, you’re out of luck. For
example, if it allocates a huge read buffer then you might have to send
hundreds of lines before it even starts processing the first one. I
have no idea how the implementation of sed that you are using does it,
but think of sort, for example: you cannot get any output before
the last line has been written and the write end of the pipe has been
closed.
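
The sort case can be sketched as follows, assuming a standard sort command is on the PATH. The helper name sort_lines is made up for illustration. Reading before close_write would block, because sort cannot know that it has seen the last line until it gets EOF:

```ruby
# Hypothetical helper: pipe lines through the external sort command.
def sort_lines(lines)
  IO.popen("sort", "w+") do |sort|
    lines.each { |line| sort.puts(line) }
    sort.close_write            # EOF tells sort that the input is complete
    sort.readlines.map(&:chomp) # only now is there anything to read
  end
end

p sort_lines(%w[pear apple fig])
```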

> BTW, the cat example works as expected, but the one using sed doesn’t
> work. That is, there is no output from sed until the pipe closes.
> There seems to be some buffering going on. I’m guessing it’s from the
> Ruby side since I don’t see this when run from the shell. But that’s
> just a guess.

The shell closes the pipe as well. It is sed that is doing the
buffering and you have no control over it unless it provides an option
to control this.
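
As one example of such an option: GNU sed has -u (--unbuffered), which makes it flush its output after each line. The sketch below assumes GNU sed is the sed on the PATH; other implementations may lack the flag or spell it differently. The helper name sed_round_trip is made up for illustration, and the timeout is only there so that a buffering sed raises an error instead of hanging:

```ruby
require "timeout"

# Hypothetical helper: send one line at a time and read the reply
# before sending the next, keeping the pipe open throughout.
def sed_round_trip(lines, command = "sed -u -ne p")
  replies = []
  IO.popen(command, "w+") do |sed|
    lines.each do |line|
      sed.puts(line)
      sed.flush
      # With an unbuffered sed this returns right away; with a
      # buffering sed it would block, hence the timeout.
      replies << Timeout.timeout(5) { sed.gets }
    end
    sed.close_write
  end
  replies.map(&:chomp)
end

p sed_round_trip(%w[hello world])
```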

> Of course, it’s entirely possible that IO.popen is not the “right” way
> to tackle this and I have not discovered the Ruby way, yet.

No, it’s the right way but your expectations cannot be met in all cases.

Kind regards

robert

cluettr · November 6, 2009, 8:15pm

On Fri, Nov 6, 2009 at 12:50 PM, Robert K.
[email protected] wrote:

> On 11/06/2009 04:01 PM, Robert C. wrote:
>
>> BTW, the cat example works as expected, but the one using sed doesn’t
>> work. That is, there is no output from sed until the pipe closes.
>> There seems to be some buffering going on. I’m guessing it’s from the
>> Ruby side since I don’t see this when run from the shell. But that’s
>> just a guess.
>
> The shell closes the pipe as well. It is sed that is doing the buffering
> and you have no control over it unless it provides an option to control
> this.

Yes, it appears that the external program is controlling the
buffering. When I tried the same process with the program I really
wanted to use, IO.popen worked pretty much the way I wanted it to.
The pattern was this:

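foo = IO.popen("external_program", "w+")
while data = gets
  # prepare data
  foo.puts(data)
  newdata = []
  # read line by line until the end-of-record marker;
  # foo.readlines would block here until EOF
  until end_of_record?(line = foo.gets)  # end_of_record? is a placeholder
    newdata << line
  end
  # process newdata
end
foo.close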

Turns out that the program I used has a signal to signify the end of a
chunk of data. So the program knows when I am finished sending data
and it can start crunching away. And I know when I can stop reading
data from the pipe and begin processing it. This saves the time of
repeatedly having to open and close the pipe.
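
A runnable sketch of this call-and-response loop might look like the following. Everything about the child is an assumption made for illustration: it is a small Ruby script standing in for the real program, and the "END" marker, CHILD_SCRIPT, call_and_response and demo are invented names. It also uses the array form of IO.popen, which needs a newer Ruby than the 1.8.6 mentioned above:

```ruby
require "rbconfig"

# Stand-in for the real external program: it answers each input line
# with two output lines followed by an "END" marker, flushing each write.
CHILD_SCRIPT = <<~'RUBY'
  $stdout.sync = true
  while (line = $stdin.gets)
    word = line.chomp
    puts word.upcase
    puts word.reverse
    puts "END"
  end
RUBY

# Send one request, then read reply lines until the marker shows up.
def call_and_response(io, request, marker = "END")
  io.puts(request)
  io.flush
  response = []
  while (line = io.gets)
    line = line.chomp
    break if line == marker
    response << line
  end
  response
end

# Keep one pipe open across several requests.
def demo
  results = []
  IO.popen([RbConfig.ruby, "-e", CHILD_SCRIPT], "w+") do |child|
    %w[hello world].each do |request|
      results << call_and_response(child, request)
    end
    child.close_write
  end
  results
end

p demo
```

The point of the marker is that the reader never has to wait for EOF, so the pipe can stay open between requests.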

>> Of course, it’s entirely possible that IO.popen is not the “right” way
>> to tackle this and I have not discovered the Ruby way, yet.
>
> No, it’s the right way but your expectations cannot be met in all cases.

It’s nice to know that I’m at least on the right track, or on one of
many possible right tracks.

Thanks for your help.

Regards,

Robert