Jumping to "the next one" in something#each

I’m sorry that this message will be long and somewhat rambling,
but I wanted to try to write up my thoughts on this, even though
my thoughts are not very clear…

There’s a situation that I occasionally run into, which I can work
around easily enough, but it seems that there might be a cleaner
solution than the ones I fall back on. Let’s say I want to have
something like:

ARGV.each { |arg|
  case arg
  when "-f"
    # ...do something with the next value of |arg|...
  end
}

What I do is set some variable to remember that I’m in the middle
of “-f” processing, and then catch that when the code-block is
executed for the next value. For example:

in_option = nil
ARGV.each { |arg|
  if in_option
    case in_option
    when "-f"
      # ...do the -f processing of this 'arg' value...
    end
    in_option = nil
    next
  end
  case arg
  when "-f"
    in_option = arg
  end
}
if in_option
  # ...some error message about the missing value for an option...
end

This can be made to work well, but it’s a mess in many ways. Please
note that I’m not asking for everyone’s favorite package for processing
ARGV, since this same situation comes up (for me at least) in many
other contexts. I am only using ARGV as a convenient example
because everyone will understand what I’m talking about. Also note
that the same situation can come up with ‘each_value’, ‘each_key’,
‘each_pair’, or other kinds of ‘each’-ish methods.

What I’d like is some way for the ‘-f’ case to say “give me the next
values for ‘each’” (perhaps multiple times in a row). If there is no
next-value, then it would get a ‘nil’ value.

And to complicate things a bit more, sometimes the option might want
to check the next option, but not necessarily use it up. For instance,
if the “-f” option can take one-or-more names, then it would keep taking
the next value for ‘arg’ until it sees arg =~ /^-/

It seems to me that this situation can come up in enough contexts that
the ruby language could provide a way to do it. I have no good idea of
what wording would make sense for this. But the idea would be
something like:

ARGV.each { |arg|
  case arg
  when "-f"
    opt_value = next_yield(0)
    if opt_value and opt_value !~ /^-/
      skip_next_yield
      # ...do something with that opt_value...
    else
      $stderr.printf "Error: Missing value for '-f'\n"
      opterr = true
    end
  end
}

So, what I’m suggesting is two new keywords for ruby. The first,
‘next_yield’, would know what the value(s) would be for the next call
to this code-block, but it would not change anything. The second,
‘skip_next_yield’, would delete that next value (or set-of-values for
methods like each_pair). I added the parameter to next_yield so that
it could also handle the values of methods like each_pair, e.g.:

somehash.each_pair { |key, strval|
  # ...whatever...
  next_key = next_yield(0)
  next_strval = next_yield(1)
}

I guess next_yield is really an array more than a keyword which takes
a parameter, but I thought it (next_yield) would be more flexible if it
could support other options with additional parameters.

It might be that ‘skip_next_yield’ would also take a parameter, which
would be the value the caller should use for the current execution of
the code-block, for those cases where the caller cares about the
return-value from the code-block. ‘each’ does not care, but a method
like ‘sort’ would care about the value.

I’m sure I’m overlooking some details on how this would have to work,
and I’m sure that ‘next_yield’ and ‘skip_next_yield’ are dumb names for
those two new keywords. But I think that some feature like the above
could be useful.

Or is there already some good way to handle this in ruby?

On Feb 9, 2007, at 5:43 PM, Garance A Drosehn wrote:

Switch this to:

list = ARGV.dup # leave ARGV alone

while item = list.shift
  case item
  when "-f"
    file = list.shift
  end
end

You can shift/unshift items as your logic dictates. When there
is nothing left the loop ends. You can break out of the loop
leaving items in ARGV via ‘break’.
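
A minimal sketch of that queue idea, assuming a hypothetical helper named parse_args (it is not part of any library). It also peeks at queue.first, so that "-f" can take one-or-more values without using up the next option:

```ruby
# Hypothetical helper: treat a copy of the argument list as a queue.
def parse_args(argv)
  queue = argv.dup            # leave the caller's array alone
  files = []
  rest  = []
  # Note: this loop stops early if the list contains nil or false.
  while (item = queue.shift)
    case item
    when "-f"
      # queue.first peeks at the next item without consuming it;
      # keep taking values until the next option (or the end).
      files << queue.shift while queue.first && queue.first !~ /^-/
    else
      rest << item
    end
  end
  { files: files, rest: rest }
end

result = parse_args(%w[-f one two -v last])
# result[:files] holds "one" and "two"; result[:rest] holds "-v" and "last"
```

The original array is never modified, at the cost of one dup.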

There are also several classes designed to process command line
arguments (e.g., GetoptLong from the standard library) that you could
use instead of rolling your own.

Gary W.


What I do is set some variable to remember that I’m in the middle
of “-f” processing, and then catch that when the code-block is
executed for the next value. For example:

Well, if you don’t mind trashing the array you’re processing you could
use Array#shift to remove the first element and return it… then just
re-order your loop to exit once the returned element is nil.

-philip

On 2/9/07, Gary W. [email protected] wrote:


Switch this to:

list = ARGV.dup # leave ARGV alone

while item = list.shift
  case item
  when "-f"
    file = list.shift
  end
end

Again, I am not looking for any ARGV-specific solution, or even a
solution which only works for arrays. I run into the same situation
in other contexts than ARGV. I see this as an issue which is bigger
than just the Array class.

And yes, I have been able to work around all of those situations by
creating various temp-variables or using other hacks, but I always
feel like I’m “working around” the problem, instead of coming up with
a nice clean pattern that solves it. I have this vague feeling that
there “should” be some cleaner way to handle all these situations.

…even if I can’t describe it very well! :-)

On 2/9/07, Philip H. [email protected] wrote:


What I do is set some variable to remember that I’m in the middle
of “-f” processing, and then catch that when the code-block is
executed for the next value. For example:

Well, if you don’t mind trashing the array you’re processing you could use
Array#shift to remove the first element and return it… then just
re-order your loop to exit once the returned element is nil.

That’s not a general solution, though. Consider an example of:

filetree.added_files.keys.sort.each { |fname|
    ...whatever...
}

I’d have to create a temp-array to hold the result from the sort step.
For that matter, I never want to destroy an array when I’m looping
through the values in it. If I’m using the ‘each’ method, then the
array-object is never modified.

Besides, I wanted something which works for any routine which
takes a code-block, not just objects of type Array. How could you
make that work for the each_pair method of Hash objects?

I was thinking there should be some “big-picture” solution possible,
one which would work for all methods which take a code-block
parameter. That is probably too complicated a goal, but at least a
solution which would work for some large subset of those methods.

On Feb 9, 2007, at 6:23 PM, Garance A Drosehn wrote:

I should also say that I am not looking for a quick-fix for some
specific script that I am writing. This is more of a “blue-sky” topic,
where I am thinking that maybe some future version of ruby could
introduce a new feature for these situations.

I think that using an array as a queue for processing items is
pretty general. The idea of pushing things back into a queue is
also a reasonably common pattern. You suggested that you didn’t
want to change your array during iteration, but if you simply
think of the array as a queue of items then it certainly makes
sense to alter the queue as you process items. You can always dup
the original array if you need continued access to the original
collection.

I gave an illustration with ARGV but the pattern works for any type
of a collection. You can always use Enumerable#to_a to convert
something to an array:

hash = { 1 => 2, 3 => 4 }
pairs = hash.to_a

while pair = pairs.shift
  # the next pair is available as pairs[0], if necessary
end

Is there some particular reason you prefer iteration via #each over
a while loop? It seems like the problem you described is more general
than the type of iteration provided by #each and so it needs a more
general solution (like a while loop). As you’ve said, you’ve managed
to shoehorn in a solution with #each via flags and such, but
perhaps that indicates that #each is simply the wrong pattern for your
use cases.

It isn’t entirely clear to me from the single example what ‘pattern’
you are trying to implement. Perhaps posting another use
case would generate some more suggestions.

Gary W.

On 2/9/07, Garance A Drosehn [email protected] wrote:

Besides, I wanted something which works for any routine which
takes a code-block, not just objects of type Array. How could you
make that work for the each_pair method of Hash objects?

I was thinking there should be some “big-picture” solution possible,
one which would work for all methods which take a code-block
parameter. That is probably too complicated a goal, but at least a
solution which would work for some large subset of those methods.

I should also say that I am not looking for a quick-fix for some
specific script that I am writing. This is more of a “blue-sky” topic,
where I am thinking that maybe some future version of ruby could
introduce a new feature for these situations.

I think the kind of solution that I’m looking for would require a change
to ruby itself. However, I also realize that I may have missed some
features which are already in ruby, and which already do exactly what
I want.

On Sat, Feb 10, 2007 at 08:23:23AM +0900, Garance A Drosehn wrote:


require 'generator'
require 'enumerator'

g = Generator.new([1,2,3])

until g.end?
  p g.next
end

def happy
  yield 1
  yield 2
  yield 3
end

# wrap a method that yields values so the generator can pull from it
g2 = Generator.new(Enumerable::Enumerator.new(self, :happy))

until g2.end?
  p g2.next
end

On Feb 9, 11:43 pm, “Garance A Drosehn” [email protected] wrote:

case arg
when "-f"
    ...do something with the *next* value of |arg|...
end

}

ARGV.each_with_index { |arg, index|
  case arg
  when "-f"
    # ... do something with ARGV[index + 1]...
  end
}
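
One thing to watch with the index approach: the element at index + 1 is still yielded on the following iteration, so the block has to skip it itself. A minimal sketch, assuming a hypothetical helper named parse_with_index and a single "-f" option:

```ruby
# Hypothetical helper: index-based lookahead with an explicit skip flag.
def parse_with_index(argv)
  file   = nil
  others = []
  skip   = false
  argv.each_with_index do |arg, index|
    if skip                      # already consumed as an option value
      skip = false
      next
    end
    case arg
    when "-f"
      file = argv[index + 1]     # nil when "-f" is the last element
      skip = true
    else
      others << arg
    end
  end
  [file, others]
end

file, others = parse_with_index(%w[-f name.txt -v])
# file is "name.txt", others holds only "-v"
```

This only works for collections that can be indexed, and the skip flag is a smaller version of the state variable from the original post.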

Cheers,

Nico

For things that support each_cons (Arrays do):
[1, 2, 3, 4, 5, 6].each_cons(2) { |current_item, next_item| p current_item if next_item == 3 }
#=> 2

For things that only support the each method:
require 'enumerator'
[1, 2, 3, 4, 5, 6].enum_for(:each).each_cons(2) { |current_item, next_item| p current_item if next_item == 3 }

So, for your example, each_cons will work on ARGV.
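
A small sketch of what each_cons(2) actually yields for an argument list (the sample values are made up). It gives lookahead but does not consume anything: the option value shows up again as current_item, and the last element never appears as current_item unless the list is padded:

```ruby
args = %w[-f name.txt -v]

# Plain each_cons: every adjacent pair, last element never "current".
pairs = []
args.each_cons(2) { |current_item, next_item| pairs << [current_item, next_item] }
# pairs is [["-f", "name.txt"], ["name.txt", "-v"]]

# Padding with a trailing nil makes the last element show up as
# current_item too, with nil as its lookahead value.
padded = []
(args + [nil]).each_cons(2) { |current_item, next_item| padded << [current_item, next_item] }
# padded is [["-f", "name.txt"], ["name.txt", "-v"], ["-v", nil]]
```

So each_cons covers the "check the next value" part, but skipping a consumed value still needs a flag of some kind.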

Dan

On Sat, Feb 10, 2007 at 08:11:16AM +0900, Garance A Drosehn wrote:

I’d have to create a temp-array to hold the result from the sort step.

Note that Array#sort does always generate a temp array.

(a.sort is implemented internally as a.dup.sort!)

Besides, I wanted something which works for any routine which
takes a code-block, not just objects of type Array. How could you
make that work for the each_pair method of Hash objects?

Change h.each_pair to h.to_a (which also creates a temp array, of
course).

In the most general case, you might need to explicitly construct the
array, changing
  foo.each { |x,y,z| ... }
to something like
  foo.map { |*x| x }

But setting this aside: I think it is possible to get the pattern you
asked for - i.e. being able to ‘pop’ the next value from an enumerable
on demand, without first flattening it into an array. Have a look at the
Generator class:

require 'generator'

a = ["arg", "-f", "one", "-c", "two", "final"]

g = Generator.new(a)
while g.next?
  arg = g.next
  if arg == "-f"
    file = g.next
    puts "File: #{file}"
  elsif arg == "-c"
    conf = g.next
    puts "Conf: #{conf}"
  else
    puts "Arg: #{arg}"
  end
end

Generators are implemented using continuations, which are not very
efficient. So you sacrifice speed for convenience.

And to complicate things a bit more, sometimes the option might want
to check the next option, but not necessarily use it up. For instance,
if the “-f” option can take one-or-more names, then it would keep taking
the next value for ‘arg’ until it sees arg =~ /^-/

Your generator then needs some buffering to be able to fetch and
remember the next value. But we already have this; Generator#current
tells you the ‘next’ item without actually removing it.

So try this for size:

require 'generator'

module Enumerable
  def geneach
    g = Generator.new(self)
    while g.next?
      yield g, g.next
    end
  end
end

a = ["arg", "-f", "one", "two", "three", "-c", "conffile", "final"]

a.geneach do |g, arg|
  if arg == "-f"
    while g.current !~ /^-/
      file = g.next
      puts "File: #{file}"
    end
  elsif arg == "-c"
    conf = g.next
    puts "Conf: #{conf}"
  else
    puts "Arg: #{arg}"
  end
end

There are probably some rough edges to sort out in this pattern (esp.
what happens if the enumerator yields multiple values) but you get the
idea.
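
For comparison, here is a sketch of the same pull-and-peek pattern using an external enumerator instead of the Generator class. It assumes a Ruby version where Enumerator#next and Enumerator#peek are available (1.9 and later, as far as I know); the helper names more? and parse_with_enumerator are made up for this example:

```ruby
# Hypothetical helper: true when the enumerator still has a value.
def more?(enum)
  enum.peek
  true
rescue StopIteration
  false
end

# Hypothetical helper: works on anything that implements #each.
def parse_with_enumerator(enumerable)
  e = enumerable.to_enum        # external enumerator over #each
  files  = []
  conf   = nil
  others = []
  loop do
    arg = e.next                # raises StopIteration at the end; loop rescues it
    case arg
    when "-f"
      # peek looks at the upcoming value without consuming it
      files << e.next while more?(e) && e.peek !~ /^-/
    when "-c"
      conf = more?(e) ? e.next : nil
    else
      others << arg
    end
  end
  { files: files, conf: conf, others: others }
end

a = ["arg", "-f", "one", "two", "three", "-c", "conffile", "final"]
parsed = parse_with_enumerator(a)
# parsed[:files] holds "one", "two", "three"; parsed[:conf] is "conffile"
```

The more? check guards the case where "-f" or "-c" is the last element, which would otherwise end the loop early via StopIteration.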

Now, doing all this with generators is a general-purpose solution; it
works with any Enumerable (i.e. anything which implements the ‘each’
method), but as I said before, it’s inefficient. You can optimise this
by making a custom geneach method for Arrays, which will be much faster.

This gives:

class ArrayGenerator
  def initialize(a)
    @a = a
    @index = 0
  end
  def next?
    @index < @a.size
  end
  def next
    v = @a[@index]
    @index += 1
    v
  end
  def current
    @a[@index]
  end
  # can also add other methods to match Generator, e.g.
  # def rewind
  #   @index = 0
  # end
end

class Array
  def geneach
    g = ArrayGenerator.new(self)
    while g.next?
      yield g, g.next
    end
  end
end

a = ["arg", "-f", "one", "two", "three", "-c", "conffile", "final"]

a.geneach do |g, arg|
  if arg == "-f"
    while g.current !~ /^-/
      file = g.next
      puts "File: #{file}"
    end
  elsif arg == "-c"
    conf = g.next
    puts "Conf: #{conf}"
  else
    puts "Arg: #{arg}"
  end
end

Notice that the ‘main’ program which parses the array is unchanged.

You can of course have both implementations loaded at once. So Arrays
will get the efficient behaviour, and anything else that is Enumerable
will get the continuation implementation.

HTH,

B.

Garance A Drosehn wrote:

ARGV.each { |arg|
  case arg
  when "-f"
    # ...do something with the next value of |arg|...
  end
}

The following works with any Enumerable, doesn’t use threads,
continuations, or generators, doesn’t construct temp arrays (except to
hold the next value(s)) or do any shifting on the original argument, and
has fairly nice syntax (but you have to remember not to fall off the end
of your block with an integer value). It handles corner cases AFAICT.

Any suggestions for a better name than #each2?

module Enumerable
  class InsufficientDataError < StandardError; end

  def each2
    req = first = rest = nil

    each do |x|
      if req
        req -= 1
        rest << x
        next if req > 0

        req = yield first, rest
        first = nil
        rest = nil
      else
        req = yield x, nil
      end

      if req
        if req.kind_of?(Integer) and req > 0
          first = x
          rest = []
        else
          raise ArgumentError, "Bad request: next #{req.inspect}"
        end
      end
    end

    if req and req > 0
      raise InsufficientDataError,
        "Ran out of data fulfilling request: next #{req}"
    end
  end
end

(1..10).each2 do |x, rest|
  next 2 if (x == 3 or x == 7) and not rest
  puts "x = #{x}, rest = #{rest.inspect}"
  nil ## make sure not to request more entries by returning an integer
end

%w{ -f foo -b bar --copy src dst }.each2 do |arg, values|
  case arg
  when "-f", "-b"
    next 1 unless values
    puts "#{arg} #{values[0]}"

  when "--copy"
    next 2 unless values
    puts "#{arg} #{values.join(" ")}"
  end

  nil ## make sure not to request more entries by returning an integer
end

__END__

Output:

x = 1, rest = nil
x = 2, rest = nil
x = 3, rest = [4, 5]
x = 6, rest = nil
x = 7, rest = [8, 9]
x = 10, rest = nil
-f foo
-b bar
--copy src dst

On Sat, 10 Feb 2007, Garance A Drosehn wrote:

Besides, I wanted something which works for any routine which
takes a code-block, not just objects of type Array. How could you
make that work for the each_pair method of Hash objects?

I was thinking there should be some “big-picture” solution possible,
one which would work for all methods which take a code-block
parameter. That is probably too complicated a goal, but at least a
solution which would work for some large subset of those methods.

this pattern will work with nearly any method

harp:~ > cat a.rb
require 'thread'

argv = %w( -v 4 -l log foobar )

q = SizedQueue.new 1
done = Object.new.freeze
t = Thread.new{ argv.each{|arg| q.push arg}; q.push done }

while( arg = q.pop )
  break if arg == done
  case arg
  when '-v'
    p [arg, q.pop]
  when '-l'
    p [arg, q.pop]
  else
    p arg
  end
end

harp:~ > ruby a.rb
["-v", "4"]
["-l", "log"]
"foobar"

it could be abstracted, but it’s already pretty easy to use.
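
One possible way to abstract it, as a rough sketch only. The method name each_on_demand and the take lambda are made up for this example, and it assumes the producer method yields values that can be pushed through a SizedQueue:

```ruby
require 'thread'   # needed on older rubies, harmless on newer ones

# Hypothetical helper: run any block-taking method in a producer thread
# and hand the consumer block a lambda that pops the next value on demand.
def each_on_demand(object, method_name = :each)
  q    = SizedQueue.new(1)
  done = Object.new.freeze
  Thread.new do
    object.send(method_name) do |*values|
      q.push(values.size == 1 ? values.first : values)
    end
    q.push(done)
  end

  finished = false
  take = lambda do
    if finished
      nil                        # nothing left; never block on an empty queue
    else
      value = q.pop
      if value.equal?(done)
        finished = true
        nil
      else
        value
      end
    end
  end

  loop do
    value = take.call
    break if finished
    yield value, take
  end
end

seen = []
each_on_demand(%w( -v 4 -l log foobar )) do |arg, take|
  case arg
  when '-v', '-l'
    seen << [arg, take.call]     # pull the option's value right here
  else
    seen << arg
  end
end
# seen is [["-v", "4"], ["-l", "log"], "foobar"]
```

If the consumer block breaks out early, the producer thread stays blocked on its push until the program exits, so this is not something to use in a long-running process without more care.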

regards.

-a

On 2/10/07, [email protected] [email protected] wrote:

argv = %w( -v 4 -l log foobar )

q = SizedQueue.new 1
done = Object.new.freeze
t = Thread.new{ argv.each{|arg| q.push arg}; q.push done }

Hmm. Well, I’m not used to even thinking of threads unless I’m doing
some kind of classic parallel processing problem (and I almost never
need to do those!), but I can see how that tactic could do the job
as a solution.

I’ll have to think about this and some of the other solutions offered,
and see what “feels right” for what I’d like to do. (whatever that means!)
Thanks for the ideas.