How to determine if pipe is given

On Sep 27, 2006, at 11:55 PM, greg wrote:

I wanted the ability to optionally pipe a file, but the program would
not require a file to be piped or given as an argument. I guess the
answer is that this cannot be done. I can only allow an optional
file
argument, not a pipe.

I’m not sure I understand what you mean by ‘pipe a file’. Pipes and/or
file redirection are arranged by the shell not by the programs running
under the shell. The ‘normal’ way this is handled is:

program file		# program opens and reads from named file
program			# program simply reads from standard input

In the second case, standard input can be connected to all sorts of
things
depending how the shell sets up the environment:

program
program < /from/file
source | program

In all three cases, the program simply reads from standard input and is
oblivious to whether standard input is associated with the terminal, a
file, or a pipe. All the ‘plumbing’ is done by the shell prior to
executing
the program.

Ruby does give you a nice way to read from either standard input or
a list of files named on the command line via the special ARGF object.
You simply let ARGF examine the command line arguments and decide whether
it should get data from STDIN or open and read data from the
named files.

ARGF is a convenience; it isn’t doing anything you couldn’t do explicitly
on your own by processing ARGV yourself.
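
To make the "do it yourself" point concrete, here is a minimal sketch of processing ARGV by hand. The helper name read_all_input is hypothetical (it is not part of Ruby), and it only approximates what ARGF does: no file arguments means read standard input, otherwise read the named files in order.

```ruby
# Hypothetical helper (not part of Ruby): roughly what ARGF does for you.
def read_all_input(argv, stdin = $stdin)
  # No file arguments: fall back to standard input. This read may block
  # until the other end of the stream is closed.
  return stdin.read if argv.empty?

  # Otherwise concatenate the named files in order, treating "-" as stdin.
  argv.map { |name| name == '-' ? stdin.read : File.read(name) }.join
end
```

A script would typically call it as `data = read_all_input(ARGV)`; passing the stream as a parameter just makes the behaviour easy to try out with a StringIO.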

I don’t know if that confuses things or not as I don’t really understand
your objective.

Gary W.

I wanted the ability to optionally pipe a file, but the program would
not require a file to be piped or given as an argument. I guess the
answer is that this cannot be done. I can only allow an optional file
argument, not a pipe.

I guess the problem comes down to: can I determine if there is data
being piped or if there is data waiting on STDIN before reading. I am
really trying to avoid hanging from doing ARGF.readlines (or something
similar) if there is no data being piped.

If a file is given as an argument it will show up in ARGV, and we know
it is there, and can act accordingly. Is there any way we can know
that piped data is there?

I avoid at all costs writing a program that will hang under any
conditions. For me, a program should always fail gracefully, it should
not lose control. If it is invoked incorrectly, there should be a
message explaining how to invoke it correctly, it should not just hang.

On Thu, 28 Sep 2006, greg wrote:

I guess the problem comes down to: can I determine if there is data
being piped or if there is data waiting on STDIN before reading. I am
really trying to avoid hanging from doing ARGF.readlines (or something
similar) if there is no data being piped.

windows or *nix?

-a

greg wrote:

I wanted the ability to optionally pipe a file, but the program would
not require a file to be piped or given as an argument. I guess the
answer is that this cannot be done. I can only allow an optional file
argument, not a pipe.

But you can have both; it’s done all the time. As has been explained, if the
first argument is a file name, you open the file. If the first argument is
a dash, this is your signal to read from STDIN. This is a classical
arrangement and it plays well with cron, because the program relies
completely on its explicit arguments, not stream tests.


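#!/usr/bin/ruby -w

data = ""

# ARGV is always an Array (and so always truthy); test the first argument itself.
if ARGV[0] == '-'
  data = STDIN.read            # a lone dash means "read standard input"
elsif ARGV[0] && File.exist?(ARGV[0])
  data = File.read(ARGV[0])    # otherwise treat the argument as a file name
end

puts data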

so the answer is that I must rely on the user to indicate what they are
doing. And if the user gives a ‘-’ and there is no pipe then the
program will hang. This is not so bad, but I was just hoping there was
a better way.
I think though there should be some way to avoid this. I don’t know if
there is lower level information here that we cannot extract in ruby,
but I also wonder if there is a way to avoid this in Ruby: for example
read from STDIN for just one second, and then determine if any data was
read. I think though in UNIX that when you pipe programs from the
shell that all the programs are opened up immediately, so even this
could have problems.

On 9/28/06, greg [email protected] wrote:

I wanted the ability to optionally pipe a file, but the program would
not require a file to be piped or given as an argument. I guess the
answer is that this cannot be done.

I am afraid so. I do not know any program that reads from STDIN that will
not block if there is no input.
Ara’s idea is how it is done in Unix; I do not know why he got criticized
so much for it :frowning:

I can only allow an optional file
argument, not a pipe.

Hmm Greg, what I came up with on my Ubuntu (and I will happily test it on
Windows and Sarge, which really should behave like my old Ubuntu) is not
elegant but might work for you.

You have a pipe for sure if

 ! STDIN.tty? && ENV["TERM"]

but now your script will not read from STDIN in cron, though it will not
block. This is a hairy thing; test it carefully please, as
it still might block in various situations, and again you can only read
from STDIN in case you are attached to a terminal!!
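
A rough sketch of a more direct way to ask what a stream is connected to, using IO#tty? together with the File::Stat predicates pipe? and file?, which are standard Ruby. The helper name stream_kind is made up for illustration. Note that this only reports how the stream was wired up; it says nothing about whether data will ever arrive on it.

```ruby
# Hypothetical helper: classify what an IO object is attached to.
def stream_kind(io)
  return :tty if io.tty?        # interactive terminal

  stat = io.stat                # File::Stat for the underlying descriptor
  return :pipe if stat.pipe?    # e.g. source | program
  return :file if stat.file?    # e.g. program < /from/file

  :other                        # sockets, /dev/null, and so on
end
```

In a script this would be called as `stream_kind($stdin)`. Under cron, stdin is often neither a terminal nor a pipe, so a program should still decide explicitly what to do in the :other case.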

HTH anyway
Robert


Two things are infinite: the universe and human stupidity; and as far as
the universe is concerned, I have not yet acquired absolute certainty.

  • Albert Einstein

On 9/28/06, [email protected] [email protected] wrote:

<Ara’s wisdom snipped, real good stuff, hopefully you read it>

i just use ‘-’, or simply always read from STDIN, and then sleep at night
:wink:

which is a very wise decision :)). Nevertheless, if Greg is dying for it,
then with some performance loss one could do the following. I’ll give a
rough sketch, as I am bad at Thread programming:

require 'timeout'

begin
  Timeout.timeout(3) do
    STDIN.eof?    # blocks until data arrives or the stream is closed
  end
rescue Timeout::Error
  # Do something without STDIN
  # Good bye
  exit
end

# Do something with STDIN

Now this is far from perfect but should catch about 99.9% of the use
cases.
Anything I missed?
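
For comparison, a sketch of the same "wait a bit, then give up" idea done with IO.select instead of a timeout block. This is a swapped-in technique, not what was posted above, and the helper name readable_within? is hypothetical. IO.select returns nil when the timeout expires, and reports the stream as readable both when data is waiting and when it has reached end of file.

```ruby
# Hypothetical helper: true if io has data (or EOF) within the given seconds.
def readable_within?(io, seconds)
  ready, = IO.select([io], nil, nil, seconds)  # nil means the wait timed out
  !ready.nil?
end
```

A script might use it as `data = readable_within?($stdin, 3) ? $stdin.read : nil`. It has the same weakness as any timeout: a slow producer on the other end of the pipe looks exactly like no producer at all.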
YUP the treatment might
BTW Greg, what do you want to do without a pipe, tell the loser that he
forgot to put one :wink:

Cheers
Robert

-a

in order to be effective truth must penetrate like an arrow - and that is
likely to hurt. – wei wu wei

Ouch



On Sep 28, 2006, at 1:03 AM, Robert D. wrote:

Ara’s idea is how it is done in Unix, I do not know why he got so much
criticized for it :frowning:

Actually I think the ‘normal’ thing in Unix is to simply do blocking
reads
from stdin if there are no file arguments. This is exactly what ARGF
does.
It is decidedly non-standard to require the use of a sentinel such
as ‘-’ (or even better /dev/stdin) to cause a program to read from
stdin.
And I believe this is what the OP is looking for. I just don’t think it
makes a lot of sense. For example, standard input redirection would
have
to be done as:

program - < /some/file/to/read

which is strange. Pipes also:

source | program -

In both cases if you leave off the ‘-’ the program will report an error
rather than reading and processing the data. The next programmer
who comes along is going to be very confused by this behavior.

Can anyone think of a standard Unix program that will only read from
stdin if you give an explicit sentinel as a file argument?

Gary W.

On Thu, 28 Sep 2006, greg wrote:

so the answer is that I must rely on the user to indicate what they are
doing. And if the user gives a ‘-’ and there is no pipe then the program
will hang. This is not so bad, but I was just hoping there was a better
way. I think though there should be some way to avoid this. I don’t know
if there is lower level information here that we cannot extract in ruby, but
I also wonder if there is a way to avoid this in Ruby: for example read from
STDIN for just one second, and then determine if any data was read. I think
though in UNIX that when you pipe programs from the shell that all the
programs are opened up immediately, so even this could have problems.

it’s so much more terrible than you realize. you can tell if there is
data
ready on stdin:

harp:~ > cat a.rb
require 'io/wait'

if STDIN.ready? # then it either has data or is at eof
  buf = STDIN.read
  puts buf
else
  puts 'no pipe'
end

harp:~ > ruby -e 'puts 42' | ruby a.rb
“42”

now, you may think that’s great. but run it a few times.



harp:~ > ruby -e 'puts 42' | ruby a.rb
no pipe

see, it’s a race condition. maybe the shell has setup the two
processes
and stdin of one is connected to the stdout of the other, then again
maybe
it’s not quite there yet. io is pain. from now on i’m writing only
functional programs with no input or output.

any sort of non-blocking read on STDIN will do the same thing. that’s
why all
well behaved programs do indeed block reading stdin when none is ready -
it’s
the only safe thing to do. btw - stevens book is amazing…
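
The race can be reproduced deterministically inside one process. The sketch below uses IO#read_nonblock with exception: false (standard Ruby) on a pipe; the helper name peek_nonblocking is made up for illustration. The point is that "nothing to read right now" and "nobody will ever write" are indistinguishable at the moment of the check.

```ruby
# Hypothetical helper: try to read without blocking.
# Returns a String when data is waiting, :wait_readable when the stream is
# open but empty, and nil at end of file.
def peek_nonblocking(io, max = 4096)
  io.read_nonblock(max, exception: false)
end

reader, writer = IO.pipe
first  = peek_nonblocking(reader)  # writer has not produced anything yet
writer.write("42\n")
second = peek_nonblocking(reader)  # the same pipe now has data
writer.close
third  = peek_nonblocking(reader)  # writer is gone: end of file
```

Here first is :wait_readable even though the pipe is perfectly valid and data shows up a moment later, which is the "no pipe" answer from the shell session above.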

i just use ‘-’, or simply always read from STDIN, and then sleep at
night :wink:

-a

On Fri, 29 Sep 2006 [email protected] wrote:

In both cases if you leave off the ‘-’ the program will report an error
rather than reading and processing the data. The next programmer
who comes along is going to be very confused by this behavior.

not if they work on unix! :wink:

Can anyone think of a standard Unix program that will only read from
stdin if you give an explicit sentinel as a file argument?

tarfile on stdout

harp:~ > tar -cf - directory > tarfile_on_stdout.tgz

piped tarfile on stdin

harp:~ > tar -cf - directory | tar -xvf -

copying files over network via piped tarfile and unpacking on the other side

harp:~ > (cd /src; tar -cvf - foo) | (ssh other.machine 'cd /dst; tar -xf -')

echo foobar into a gzip of stdin, sending compressed output to stdout, pipe
that into another gzip which is decompressing stdin, dump that output back
out to stdout

harp:~ > echo foobar | gzip - -c | gzip -d - -c
foobar

use image magick’s convert command to convert stdin -> stdout

harp :~ > convert - - < map.png > map2.png
harp :~ > file map2.png
map2.png: PNG image data, 713 x 569, 8-bit/color RGB, non-interlaced

grep for list of patterns on stdin

harp:~ > echo alias | grep -f- .bashrc
alias new=‘ls -ltar’
alias p=“fetchmail;pine -i -passfile /home/ahoward/.passfile”
alias g=“glimpse -n -H”
alias gi=‘glimpseindex -B -t -f -H’
alias ldate=‘env TZ=America/Denver date’
alias ssh=‘ssh -X’
alias vi=‘vim’
alias mussel=‘tti -A [email protected]
alias ss='screen -S ’
alias sl='screen -list ’
alias sdr='screen -d -r ’
alias s='screen -D -R ’
alias dark=‘eval dircolors /etc/DIR_COLORS && export
DIR_COLORS=dark’
alias light=‘eval dircolors /etc/DIR_COLORS.xterm && export
DIR_COLORS=light’
alias xt=“xterm -font 7x13 -fb 7x13B -geometry 80x25 -sb -wf -j -ls
-bg Black -fg grey &”

note that none of these are documented. i think that’s because it’s
considered ‘standard’ behaviour.

cheers.

-a

On Fri, 29 Sep 2006 [email protected] wrote:

note that none of these are documented. i think that’s because it’s
considered ‘standard’ behaviour.

oh yeah - nearly forgot my favourite: curl. it takes about six billion bits
of information optionally on stdin

harp:~ > PAGER=cat man curl|grep -B1 -A1 stdin
       file name to read the data from, or - if you want curl to read
       the data from stdin. The contents of the file must already be
       url-encoded. Multiple files can also be specified. Posting data

       To read the file's content from stdin insted of a file, use -
       where the file name should've been. This goes for both @ and <

       Specify the filename as '-' to make curl read the file from
       stdin.

       Use the file name "-" (a single dash) to use stdin instead of a
       given file.

       "string", to get read from a particular file you specify it
       "@filename" and to tell curl to read the format from stdin you
       write "@-".

harp:~ > echo http_code | curl --write-out - http://ruby-lang.org

302 Found

Found

The document has moved here.

it’s even documented - how un-unixish! :wink:

-a

On Fri, 29 Sep 2006 [email protected] wrote:

not if they work on unix! :wink:

Huh? That is like saying that grep at the end of a pipeline is expected
to be called as:

source | grep 'pattern' -

instead of simply

source | grep 'pattern'

you are right. i’m just saying it’s not uncommon to mark stdin with a
sentinel such as ‘-’.

Hmm. Tar is a bit of a special case, isn’t it? Its default behavior is to
access devices in /dev. You don’t need -f, if you are using the default
devices.

for output yes. for input no.

But gzip works just fine without all those extra arguments:

echo foobar | gzip > foobar.gz
echo foobar | gzip | gzip -d -c

that is true. i’m merely pointing out the standardness of ‘-’, not its
required-ness…

doesn’t work?
yes.

harp:~ > file map.png
map.png: PNG image data, 640 x 480, 8-bit colormap, non-interlaced

harp:~ > convert < map.png > map2.png

harp:~ > file map2.png
map2.png: ASCII English text

note that the output file is the output of --help. note the last 5
lines :wink:

harp:~ > tail -5 map2.png
By default, the image format of `file’ is determined by its magic
number. To
specify a particular image format, precede the filename with an image
format
name and a colon (i.e. ps:image) or specify the image type as the
filename
suffix (i.e. image.ps). Specify ‘file’ as ‘-’ for standard input or
output.

If you look back at my posting I said that a program that requires the use
of ‘-’ would be unusual. I understand why some programs support the ‘-’
notation, which is still a hack if you’ve got /dev/stdin or /dev/stdout
available.

i personally wouldn’t quite call it a hack though. curl, wget, tar, gzip,
image magick - these are all commands which will compile on many oses. some
of them even compile on windows! :wink:

i suppose it just depends on what kinds of command one is using and in
what
kinds of environments with respect to whether one finds ‘-’ odd or not.
i run
many things under cron and in a clustered environment and have grown
accustomed to it. i admit it’s not common for tty usage.

Maybe I totally missed the point but I thought the OP was making the
requirement that his program would not bother to read from stdin unless it
saw something special on the command line, which led to your suggestion of
the ‘-’. My point is that the OP’s requirement would be very unexpected for
the usual Unix programmer.

sorry. you are right here of course. i’m going to bow out of this thread -
the last thing i’ll say is that i’ve been bitten by automatic reading of
stdin and, even though many unix standards do it (cat, gzip, etc), i’ve
noticed that newer and more complicated programs, e.g. curl, seem to have
shied away from it. i have too, just because it’s bitten me with hung
programs in a clustered environment (no tty).

anyhow - this horse is dead! :wink:

kind regards.

-a

On 9/27/06, [email protected] [email protected] wrote:

apparently in perl, if there is a piped input, ‘-’ will show up
automatically in ARGV. I think I will propose this change to Ruby.

eeeks! that is pure evil!

Yes I agree with this, but…

consider i have many, many programs which do things like

convert infile outfile
convert - outfile # infile on stdin
convert infile - # outfile on stdout
convert - - # infile on stdin, outfile on stdout
convert --infiles=- # list of infiles on stdin, auto-name outfiles

this is standard unix practice (do a man on gzip, tar, etc).
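
A minimal sketch of how a Ruby script might honour the ‘-’ convention listed above. The helper names open_input and open_output are hypothetical; the streams are passed in as parameters only so the behaviour is easy to exercise.

```ruby
# Hypothetical helpers: map a command-line name to a stream,
# treating a lone dash as standard input or standard output.
def open_input(name, stdin = $stdin)
  name == '-' ? stdin : File.open(name, 'r')
end

def open_output(name, stdout = $stdout)
  name == '-' ? stdout : File.open(name, 'w')
end
```

A program invoked as `program infile -` would then call `open_input(ARGV[0])` and `open_output(ARGV[1])` and never needs to guess what stdin is attached to. The caller remains responsible for closing any files that were actually opened.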

Standard but by no means universal. For example the mysql command
doesn’t use or like it:

rick@frodo:/public/rubyscripts$ mysql -p -
Enter password:
ERROR 1049 (42000): Unknown database ‘-’

It either tries to use - as the database name or:

rick@frodo:/public/rubyscripts$ mysql -p depot_development -
mysql Ver 14.12 Distrib 5.0.22, for pc-linux-gnu (i486) using readline
5.1

Usage: mysql [OPTIONS] [database]

It tells me that I don’t know the right syntax.

I don’t think that it has been mentioned here that one reason detecting
what’s connected to stdin/stdout is useful is when a tool wants to have
interactive and non-interactive modes:

Here’s mysql in non-interactive mode:

rick@frodo:/public/rubyscripts$ echo "describe products;" | mysql -p depot_development
Enter password:
Field Type Null Key Default Extra
id int(11) NO PRI NULL auto_increment
title varchar(255) YES NULL
description text YES NULL
image_url varchar(255) YES NULL
price decimal(8,2) YES 0.00
rick@frodo:/public/rubyscripts$

And here it is in interactive mode:
rick@frodo:/public/rubyscripts$ mysql -p depot_development
Enter password:
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 5691 to server version:
5.0.22-Debian_0ubuntu6.06-log

Type ‘help;’ or ‘\h’ for help. Type ‘\c’ to clear the buffer.

mysql> describe products;
±------------±-------------±-----±----±--------±---------------+
| Field | Type | Null | Key | Default | Extra |
±------------±-------------±-----±----±--------±---------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| title | varchar(255) | YES | | NULL | |
| description | text | YES | | NULL | |
| image_url | varchar(255) | YES | | NULL | |
| price | decimal(8,2) | YES | | 0.00 | |
±------------±-------------±-----±----±--------±---------------+
5 rows in set (0.01 sec)

mysql>

Note that it not only puts out prompts but it changes the output to
be for human rather than computer consumption.
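
The same switch is easy to sketch in Ruby. The helper render_rows below is hypothetical, and how mysql itself makes the decision is not shown in this thread; the sketch just assumes the caller passes in the result of a tty? check and picks a boxed layout for people or tab-separated fields for other programs.

```ruby
# Hypothetical helper: format rows for a person or for a pipeline.
def render_rows(rows, interactive)
  # Non-interactive: plain tab-separated fields, one row per line.
  return rows.map { |row| row.join("\t") }.join("\n") unless interactive

  # Interactive: pad every column to its widest cell and draw separators.
  widths = rows.transpose.map { |column| column.map { |cell| cell.to_s.length }.max }
  rows.map do |row|
    cells = row.each_with_index.map { |cell, i| cell.to_s.ljust(widths[i]) }
    "| " + cells.join(" | ") + " |"
  end.join("\n")
end
```

Typical use would be `puts render_rows(rows, $stdout.tty?)`, so that redirecting the output to a file or a pipe automatically yields the machine-friendly form.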

auto-munging of ARGV is a bad idea imho.

I agree but for other reasons.

As for programs hanging if you do

program &

Well, that’s really a user error, and maybe even not that, maybe I
want to suspend program and then use fg to resume it.

And - doesn’t really help this. I think that it’s better in most
cases to solve it the other way around with something like

program </dev/null &


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

Thanks for all the info guys. Here is what I came up with.
This code seems invincible so far- it even works with &

require 'thread'

input = nil
t = Thread.new() do
  input = STDIN.readlines
end

# do some stuff

# parse some options

while t.alive?
  old = input
  puts "thread running for 1 second"
  Kernel.sleep(1)

  if old == input
    if input
      puts "done collecting input"
    else
      puts "no input found on STDIN"
    end
    Kernel.sleep(1)

    t.kill
    break

  else
    puts "done collecting input"
    Kernel.sleep(1)
  end
end

puts input

Greg

small typo there, this is better…

require 'thread'

input = nil
t = Thread.new() do
  input = STDIN.readlines
end

# do some stuff

# parse some options

while t.alive?
  old = input
  puts "collecting input…" if old

  puts "thread running for 1 second"
  Kernel.sleep(1)

  if old == input
    if input
      puts "done collecting input"
    else
      puts "no input found on STDIN"
    end
    Kernel.sleep(1)
    t.kill
    break
  end
end

puts input

On Sep 28, 2006, at 12:09 PM, [email protected] wrote:

On Fri, 29 Sep 2006 [email protected] wrote:

In both cases if you leave off the ‘-’ the program will report an
error
rather than reading and processing the data. The next programmer
who comes along is going to be very confused by this behavior.

not if they work on unix! :wink:

Huh? That is like saying that grep at the end of a pipeline is expected
to be called as:

source | grep 'pattern' -

instead of simply

source | grep 'pattern'

Can anyone think of a standard Unix program that will only read from
stdin if you give an explicit sentinel as a file argument?

tarfile on stdout

harp:~ > tar -cf - directory > tarfile_on_stdout.tgz

Hmm. Tar is a bit of a special case, isn’t it? Its default behavior
is to access devices in /dev. You don’t need -f, if you are using the
default devices.

echo foobar into a gzip of stdin, sending compressed output to stdout, pipe
that into another gzip which is decompressing stdin, dump that output back
out to stdout

harp:~ > echo foobar | gzip - -c | gzip -d - -c
foobar

But gzip works just fine without all those extra arguments:

echo foobar | gzip > foobar.gz
echo foobar | gzip | gzip -d -c

use image magick’s convert command to convert stdin -> stdout

harp :~ > convert - - < map.png > map2.png
harp :~ > file map2.png
map2.png: PNG image data, 713 x 569, 8-bit/color RGB, non-interlaced

Are you saying that

convert < map.png > map2.png

doesn’t work?

If you look back at my posting I said that a program that requires
the use of ‘-’
would be unusual. I understand why some programs support the ‘-’
notation, which is
still a hack if you’ve got /dev/stdin or /dev/stdout available.

Maybe I totally missed the point but I thought the OP was making the
requirement that
his program would not bother to read from stdin unless it saw
something special on
the command line, which led to your suggestion of the ‘-’. My point
is that the OP’s
requirement would be very unexpected for the usual Unix programmer.

should be
old = input ? input.dup : nil

However, a duration of time must be specified to wait for the pipe or
timeout. So I guess IO is pain :slight_smile:
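
For what it is worth, the polling loop can be collapsed into a single Thread#join with a time limit, which returns nil if the thread has not finished in time. The sketch below is one possible shape, not a fix for the underlying problem; the name read_with_deadline is hypothetical, and the deadline is still an arbitrary guess, so a slow writer is cut off exactly like a missing one.

```ruby
# Hypothetical helper: read everything from io, giving up after `seconds`.
# Returns the data as a String, or nil if end of file was not reached in time.
def read_with_deadline(io, seconds)
  reader = Thread.new { io.read }  # blocks inside the thread until EOF
  if reader.join(seconds)          # join returns nil when the limit expires
    reader.value
  else
    reader.kill                    # abandon the blocked read
    nil
  end
end
```

In a script this would be something like `input = read_with_deadline($stdin, 1)`, followed by a usage message when the result is nil.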