External yielder for Enumerator

Detlef_R · March 17, 2014, 9:52am

Hello ruby friends,

I’d like to provide a way for my client code to define its own iteration
rules using yield. Here’s a sketch of the API I’m trying to achieve:

 repeater = MyRepeater.new(5) do |num|
   yield num
   yield num * 10
   yield num * 100
 end

 repeater.each do |i|
   puts i
 end
 # 5
 # 50
 # 500

Roughly, I thought the repeater class might look something like this,
simply dispatching for the passed block:

 class MyRepeater
   def initialize(start, &yielder)
     @start = start
     @yielder = yielder
   end

   def each
     @yielder.call(@start) do |i|
       yield i
     end
   end
 end

That’s not working (LocalJumpError). I know Enumerator can be defined
with a block that gets a “yielder” object, so I tried using that inside
my “each” method as well (same error). Ultimately I’d like to avoid
having to expose the yielder parameter to my client code. Any tips on
making it work?
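
For anyone reproducing this, here is a minimal sketch of where the LocalJumpError most likely comes from. A yield written inside a block belongs to the method in which the block was written, not to the block later handed to Proc#call. The names BrokenRepeater and build_broken_repeater are hypothetical; the wrapper method exists only so that the bare yield has an enclosing method.

```ruby
# Same shape as the MyRepeater sketch above (renamed to avoid confusion).
class BrokenRepeater
  def initialize(start, &rule)
    @start = start
    @rule = rule
  end

  def each
    # The block given here is simply ignored by the rule below.
    @rule.call(@start) { |i| yield i }
  end
end

# Hypothetical client code. The `yield` inside the block refers to the block
# of build_broken_repeater itself, and none is given when it is called.
def build_broken_repeater
  BrokenRepeater.new(5) { |num| yield num }
end

error = begin
  build_broken_repeater.each { |i| puts i }
  nil
rescue LocalJumpError => e
  e
end

puts error.class
```

So the block passed to @yielder.call inside each is never reached; the rule block has no way to see it through the yield keyword.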

Andrew V.

arvliet · March 17, 2014, 10:13am

On Mon, Mar 17, 2014 at 9:51 AM, Andrew V. [email protected] wrote:

That’s not working (LocalJumpError). I know Enumerator can be defined with a
block that gets a “yielder” object, so I tried using that inside my “each”
method as well (same error). Ultimately I’d like to avoid having to expose
the yielder parameter to my client code. Any tips on making it work?

What stops you from simply doing this?

 class MyRepeater
   include Enumerable

   def initialize(n)
     @n = n
   end

   def each
     return to_enum(:each) unless block_given?
     yield @n
     yield @n * 10
     yield @n * 100
     self
   end
 end

Kind regards

robert

arvliet · March 17, 2014, 10:36am

On Mon, Mar 17, 2014 at 10:12 AM, Robert K.
[email protected] wrote:

If I understand correctly, he wants the client code of MyRepeater to
define what to yield (and how many times per iteration) to the block.
So one instance of my repeater could yield n, n * 10 and n * 100, and
another instance could yield a different set of values.

As far as I know, it’s not possible to have a block yield to
another block, because this only works inside methods. The only way I
can think of is to pass the proc explicitly:

 # Ruby 2.0.0p195 (irb)
 class MyRepeater
   def initialize(n, &yielder)
     @n = n
     @yielder = yielder
   end

   def each(&blk)
     @yielder.call(@n, blk)
   end
 end

 rep1 = MyRepeater.new(5) { |x, yielder|
   yielder.call(x); yielder.call(x * 10); yielder.call(x * 100)
 }
 rep1.each { |x| puts x }
 # 5
 # 50
 # 500

I know it’s not what you wanted (the client code still has to use the
yielder object). Maybe someone will come up with a better way. I can
think of another way: defining a method to yield in MyRepeater and
instance_exec’ing the block, but I think it’s too complex and can be
confusing. Also, you could not name it exactly “yield”, because Ruby
parses that as the keyword even if you have a method of that name:

 # Ruby 2.0.0p195 (irb)
 class MyRepeater
   def initialize(n, &yielder)
     @n = n
     @yielder = yielder
   end

   def my_yield(n)
     @blk.call(n)
   end

   def each(&blk)
     @blk = blk
     instance_exec(@n, &@yielder)
   end
 end

 rep = MyRepeater.new(5) { |n| my_yield n; my_yield n * 10; my_yield n * 100 }
 # => #<MyRepeater:0x00000002372330 @n=5, @yielder=#<Proc:0x000000023722e0@(irb):135>>
 rep.each { |x| puts x }
 # 5
 # 50
 # 500

Not sure if this is really an advantage over explicitly passing a
parameter to the block for explicit yielding. YMMV.

Regards,

Jesus.

arvliet · March 17, 2014, 6:09pm

On 14-03-17, 2:35, Jesús Gabriel y Galán wrote:

If I understand correctly, he wants the client code of MyRepeater to
define what to yield (and how many times per iteration) to the block.
So one instance of my repeater could yield n, n * 10 and n * 100, and
another instance could yield a different set of values.

Yes, that’s exactly what I’m asking. I want to define the iteration
outside of my repeater class, but preferably make it look like a bare
yield like we would do inside the each method.

As far as I know, it’s not possible to have a block yield to
another block, because this only works inside methods. The only way I
can think of is to pass the proc explicitly:
…
I know it’s not what you wanted (the client code still has to use the
yielder object). Maybe someone will come up with a better way. I can
think of another way: defining a method to yield in MyRepeater and
instance_exec’ing the block, but I think it’s too complex and can be
confusing. Also, you could not name it exactly “yield”, because Ruby
parses that as the keyword even if you have a method of that name:

I thought of instance eval with a yield method too… but it seems like
too much of a hack and would break scoping for more things than it’s
worth.

I suspect there’s some clever trick using Fiber, the way
Enumerator::Yielder does it, but maybe that still involves passing the
ball back and forth…
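
A rough sketch of what a Fiber-based version might look like, assuming the client block is allowed to call Fiber.yield in place of the bare keyword. FiberRepeater and the DONE sentinel are hypothetical names. It does not remove the back-and-forth; it only hides it inside each.

```ruby
class FiberRepeater
  include Enumerable

  # Sentinel returned by the fiber once the client block has finished.
  DONE = Object.new

  def initialize(start, &rule)
    @start = start
    @rule = rule
  end

  def each
    return to_enum(:each) unless block_given?

    fiber = Fiber.new do
      @rule.call(@start) # every Fiber.yield in the rule suspends right here
      DONE               # final value handed to the last resume
    end

    loop do
      value = fiber.resume
      break if value.equal?(DONE)
      yield value        # a real yield, because we are inside a method
    end
    self
  end
end

rep = FiberRepeater.new(5) do |num|
  Fiber.yield num
  Fiber.yield num * 10
  Fiber.yield num * 100
end

rep.each { |i| puts i }
```

The client code stays free of an extra block parameter, but Fiber.yield is still not the plain yield keyword, and a value equal to the sentinel object could never be passed through.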

Thanks for the feedback.

Andrew V.

arvliet · March 18, 2014, 1:06am

On 14-03-17, 13:25, Matthew K. wrote:

 repeater = [5, 50, 100]
 repeater.each do|i|
   puts i
 end
The API consumer already has the @yielder proc and the @start value, so
they could do the evaluation themselves.

Is there more to it?

Yes, there is more to it of course! I reduced the problem down to
just the part I was trying to solve. In essence, the repeater needs to
do just a few things:

1. Hold an initial value
2. Provide an interface to configure HOW TO yield successive copies
3. Behave like a normal enum with “each”

The iteration needs to be configurable by the client code, and it needs
to receive the initial object to define the iteration and decide when to
stop. All this happens in the block.

You’ve turned my yielder into an array, where the list of items is
predetermined, and there’s no way to control it (it’s missing step 2).

I ended up using Enumerator::Yielder syntax (Enumerator.new{|y|}), so
now the interface looks like this:

 MyRepeater.new(5) do |output, num|
   output << num
   # mutate num, repeat...
 end

My original goal was to avoid the extra “yielder” block parameter and
just provide for normal yield, but this is pretty clean too. (“output”
is the Enumerator::Yielder in this example, and << is an alias for
Yielder#yield)
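
A minimal sketch of how a repeater with that interface could be wired up, assuming it just wraps Enumerator.new and forwards the Enumerator::Yielder plus the stored start value to the configured block. The class body is a guess for illustration, not the actual implementation.

```ruby
class MyRepeater
  include Enumerable

  def initialize(start, &rule)
    @start = start
    @rule = rule
  end

  def each(&block)
    # `output` is the Enumerator::Yielder; the rule decides what to push.
    enum = Enumerator.new { |output| @rule.call(output, @start) }
    return enum unless block

    enum.each(&block)
    self
  end
end

repeater = MyRepeater.new(5) do |output, num|
  output << num
  output << num * 10
  output << num * 100
end

repeater.each { |i| puts i }
```

Because the rule only runs when each is called, Enumerable methods such as first(2) stop the rule early instead of computing every item.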

I also needed this repeater to call other code between yielding each
iteration. The Enumerator docs show how to do that pretty well:

(find “internal iterator”)
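
The link did not survive here, so the following is only a guess at the kind of pattern meant: driving an Enumerator externally with next, which leaves room to run other code around each item. The "setup" and "teardown" entries are placeholders.

```ruby
enum = Enumerator.new do |output|
  output << 5
  output << 50
  output << 500
end

log = []
loop do
  item = enum.next   # raises StopIteration when exhausted; Kernel#loop rescues it
  log << "setup"     # placeholder for work done before handing out the item
  log << item
  log << "teardown"  # placeholder for work done afterwards
end

p log
```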

Andrew V.

arvliet · March 18, 2014, 2:20am

Firstly, let me preface by saying I’m not attacking. It’s an interesting
problem, I just don’t understand (and I’d like to).

Andrew V. wrote in post #1140197:

1. Hold an initial value
2. Provide an interface to configure HOW TO yield successive copies
3. Behave like a normal enum with “each”

The iteration needs to be configurable by the client code, and it needs
to receive the initial object to define the iteration and decide when to
stop. All this happens in the block.

In other words, if I’m using your Repeater thingy, I:

1. pick a value
2. pick a bunch of operations to perform on that value
3. iterate through the results of those operations using #each

Is that not just an array?

You’ve turned my yielder into an array, where the list of items is
predetermined, and there’s no way to control it (it’s missing step 2).

Except that I’m the one that chose value n=5, and I’m also the one that
chose operations [n, n * 10, n * 100]. I controlled all of that. I did this:

 # step 1:
 num = 5
 # step 2:
 repeater = []
 repeater << num
 repeater << num * 10
 repeater << num * 100
 # step 3:
 repeater.each {…}

… except that, in this case, I was able to do it offline, because I
already know what 5, 5 * 10, and 5 * 100 are (even though I wrote 100… oops).

This is where my understanding falls down. Is there another actor in
this scenario that I’m missing? I think it would help me if you could
come up with another contrived example that shows how an array is
insufficient. Is the initial value actually a stateful creature that you
don’t want to mutate until calling #each ?

Or, alternatively, explain why I should use your repeater class instead
of doing this:

class Foo
  def initialize
    @num = 5
  end
  def each
    yield @num
    yield @num * 10
    yield @num * 100
  end
end

What entices me to pass the value and the effective body of #each to
your utility class, instead of just doing it myself? What else do you add?

arvliet · March 18, 2014, 8:17am

On Tue, Mar 18, 2014 at 2:19 AM, Matthew K. [email protected]
wrote:

Firstly, let me preface by saying I’m not attacking. It’s an interesting
problem, I just don’t understand (and I’d like to).

Same here.

In other words, if I’m using your Repeater thingy, I:

1. pick a value
2. pick a bunch of operations to perform on that value
3. iterate through the results of those operations using #each

Is that not just an array?

The timing is different. Calculation of array elements may be
expensive and I assume Andrew wants to be able to stop iterating at
any point thus not wasting the calculations.
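
To illustrate the timing difference, here is a small sketch that counts how many "expensive" calculations actually run. The lambda named expensive is a hypothetical stand-in that only records its argument.

```ruby
computed = []
expensive = lambda do |n|
  computed << n # record that the calculation ran
  n
end

# Array: every element is calculated up front.
eager = [expensive.call(5), expensive.call(50), expensive.call(500)]
eager_count = computed.size

# Enumerator: nothing runs until a consumer asks for items.
computed.clear
lazy = Enumerator.new do |out|
  out << expensive.call(5)
  out << expensive.call(50)
  out << expensive.call(500)
end
first_item = lazy.first # the block is abandoned after the first item
lazy_count = computed.size

puts "array: #{eager_count} calculations, enumerator: #{lazy_count}"
```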

I think the original example was not a good model of the intended use
case. So Andrew, please provide a more realistic example where you
also clearly indicate which code originated from which role (you as
library provider, some user).

What entices me to pass the value and the effective body of #each to your
utility class, instead of just doing it myself? What else do you add?

Yes, I think we need more input.

Kind regards

robert

arvliet · March 18, 2014, 9:31am

On 14-03-17, 18:19, Matthew K. wrote:

Firstly, let me preface by saying I’m not attacking. It’s an interesting
problem, I just don’t understand (and I’d like to).

Cool. I’m happy to discuss it. First, forget about iterating Fixnums.

Is that not just an array?

If you know the predetermined list of items, sure it could be an array.
How about we add the following constraints:

1. You can’t know when to stop iterating until you fetch the last item.
2. It’s prohibitive to prefetch all the items into an array first, and
   then traverse it.
3. The implementation to fetch the next item, and deciding where to stop
   is handled differently for many different configurations and needs to be
   passed in externally.
4. The iterator needs to do some internal setup before/after yielding
   each item.

repeater << num
repeater << num * 10
repeater << num * 100
# step 3:
repeater.each {...}
… except that, in this case, I was able to do it offline, because I
already know what 5, 5 * 10, and 5 * 100 are (even though I wrote 100… oops).

See, there’s too much assumed knowledge all in one place there. I need
to separate responsibilities into 3 places to hide complexity, and each
part knows its own role:

1. Getting some external object that we start with (It’s not a simple
   Fixnum 5)
2. Defining how we need to iterate over it.
3. Abstracting that iteration as a simple “each”

Parts 1 and 2 are somewhat related; they have some knowledge &
correspondence with each other but I want to extract a common interface
for the iteration part so it can apply to anything that’s passed in.

This is where my understanding falls down. Is there another actor in
this scenario that I’m missing? I think it would help me if you could
come up with another contrived example that shows how an array is
insufficient. Is the initial value actually a stateful creature that you
don’t want to mutate until calling #each ?

Correct, it is not a single class; it is stateful; stuff happens only
once yielded to each. You get 3 for 3!

What entices me to pass the value and the effective body of #each to
your utility class, instead of just doing it myself? What else do you add?

Your “each” is defined internally in your class. I need that logic to be
passed from a DSL block from a config file. How about this for a
hopefully more revealing example DSL:

 repeat do |pages, remote_source|
   loop do
     pages << remote_source
     break if remote_source.data.split("\n").size < 100
     remote_source.query_params["page"] += 1
   end
 end

This is a single example of a data source. There are many like it, but
only some use HTTP with a page query. Another one might get the number
of pages from the remote source and use 1.upto(pg), or list some files
from FTP… it all depends and it’s impossible to hard-code the
iteration logic into my data Provider or its supporting Repeater class.
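
To make that concrete, here is a self-contained sketch of how such a rule could be driven. FakeSource is a hypothetical in-memory stand-in for a paged remote source (250 rows, 100 per page), and the Repeater body is an assumption about how the pieces might fit, not the real Provider code.

```ruby
# Hypothetical stand-in for a remote source that serves 100 rows per page.
class FakeSource
  attr_reader :query_params

  def initialize(total_rows)
    @total_rows = total_rows
    @query_params = { "page" => 1 }
  end

  def data
    first = (query_params["page"] - 1) * 100
    last = [first + 100, @total_rows].min
    (first...last).map { |i| "row #{i}" }.join("\n")
  end
end

class Repeater
  include Enumerable

  def initialize(source, &rule)
    @source = source
    @rule = rule
  end

  def each
    return to_enum(:each) unless block_given?

    Enumerator.new { |pages| @rule.call(pages, @source) }.each do |src|
      # internal setup before each item would go here
      yield src
      # internal cleanup after each item would go here
    end
    self
  end
end

source = FakeSource.new(250)
repeater = Repeater.new(source) do |pages, remote_source|
  loop do
    pages << remote_source
    break if remote_source.data.split("\n").size < 100
    remote_source.query_params["page"] += 1
  end
end

sizes = repeater.map { |src| src.data.split("\n").size }
p sizes
```

The rule mutates and re-yields the same source object, so each consumer step sees the page that is current at that moment; the stopping condition is only discovered after the short last page has been fetched.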

Make sense?

Andrew V.

arvliet · March 17, 2014, 9:26pm

I’ve been thinking over this all night (in my sleep, apparently) and I
still can’t imagine a scenario where this:

On Mar 17, 2014 6:52 PM, “Andrew V.” [email protected] wrote:

# 5
# 50
# 500

is different from this:

repeater = [5, 50, 100]
repeater.each do|i|
  puts i
end

The API consumer already has the @yielder proc and the @start value, so
they could do the evaluation themselves.

Is there more to it?

Matty