Forum: Ruby-core [ruby-trunk - Bug #7715][Open] Lazy enumerators should want to stay lazy.

Posted by marcandre (Marc-Andre Lafortune) (Guest)
on 2013-01-18 18:33
(Received via mailing list)
Issue #7715 has been reported by marcandre (Marc-Andre Lafortune).

----------------------------------------
Bug #7715: Lazy enumerators should want to stay lazy.
https://bugs.ruby-lang.org/issues/7715

Author: marcandre (Marc-Andre Lafortune)
Status: Open
Priority: Normal
Assignee:
Category: core
Target version:
ruby -v: r38825


I'm just waking up to the fact that many methods turn a lazy enumerator 
into a non-lazy one.

Here's an example from Benoit Daloze in [ruby-core:44151]:

lines = File.foreach('a_very_large_file').lazy
            .select {|line| line.length < 10 }
            .map {|line| line.chomp!; line }
            .each_slice(3)
            .map {|lines| lines.join(';').downcase }
            .take_while {|line| line.length > 20 }

That code will produce the right result but *will read the whole file*, 
which is not what is desired.

Indeed, `each_slice` currently does not return a lazy enumerator :-(

To make the above code work as intended, one must call `.lazy` right after 
the `each_slice(3)`. I feel this is dangerous and counterintuitive.
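
A minimal sketch of that workaround, for anyone who wants to try it without a large file. The counting `source` and the `reads` variable are hypothetical stand-ins for `File.foreach`, not part of the original example, and the block bodies are simplified. The extra `.lazy` should be harmless on a Ruby where `each_slice` already returns a lazy enumerator.

```ruby
# Hypothetical stand-in for File.foreach: an endless source that counts
# how many lines have been pulled from it.
reads = 0
source = Enumerator.new do |yielder|
  loop do
    reads += 1
    yielder << "line #{reads}\n"
  end
end

groups = source.lazy
               .map { |line| line.chomp }
               .each_slice(3)
               .lazy # the workaround; changes nothing if the chain is already lazy
               .map { |slice| slice.join(';') }
               .take_while { |joined| joined.length < 30 }
               .first(2)

p groups # the first two joined slices
p reads  # only as many lines as those two slices needed
```

Because the source never ends, this only terminates if every step after `each_slice(3)` is lazy, which makes it a quick check for the problem described here.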

Is there a valid reason for this behavior? Otherwise, I would like us to 
consider returning a lazy enumerator for the following methods:
  (when called without a block)
    each_with_object
    each_with_index
    each_slice
    each_entry
    each_cons
  (always)
    chunk
    slice_before

The arguments are:
* fail early (much easier to realize one needs to call a final `force`, 
`to_a` or `each` than realizing that a lazy enumerator chain isn't 
actually lazy)
* easier to remember (every method normally returning an enumerator 
returns a lazy enumerator). Basically, this makes Lazy covariant
* I'd expect that if you get lazy at some point, you typically want to 
remain lazy until the very end
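
One way to check the methods listed above is to record the class each of them returns when called on a lazy enumerator. This is only a sketch: the names `lazy`, `candidates` and `classes` are made up for illustration, and the result depends on the Ruby being run. On a Ruby where this proposal has been adopted, every entry should be `Enumerator::Lazy`; on the revision in this report, several are expected to be plain `Enumerator`.

```ruby
lazy = (1..Float::INFINITY).lazy

# The first five are called without a block; chunk and slice_before are
# given the block they require. None of these calls iterates the source.
candidates = {
  each_with_object: lazy.each_with_object([]),
  each_with_index:  lazy.each_with_index,
  each_slice:       lazy.each_slice(2),
  each_entry:       lazy.each_entry,
  each_cons:        lazy.each_cons(2),
  chunk:            lazy.chunk { |i| i.even? },
  slice_before:     lazy.slice_before { |i| (i % 3).zero? }
}

classes = {}
candidates.each { |name, enum| classes[name] = enum.class }

classes.each { |name, klass| puts "#{name}: #{klass}" }
```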
Posted by ko1 (Koichi Sasada) (Guest)
on 2013-01-25 04:09
(Received via mailing list)
Issue #7715 has been updated by ko1 (Koichi Sasada).

Target version set to 2.0.0

Whose ball?
----------------------------------------
Bug #7715: Lazy enumerators should want to stay lazy.
https://bugs.ruby-lang.org/issues/7715#change-35595

Posted by marcandre (Marc-Andre Lafortune) (Guest)
on 2013-01-25 07:20
(Received via mailing list)
Issue #7715 has been updated by marcandre (Marc-Andre Lafortune).

Status changed from Open to Assigned
Assignee set to marcandre (Marc-Andre Lafortune)

I can do it, unless there are objections.
----------------------------------------
Bug #7715: Lazy enumerators should want to stay lazy.
https://bugs.ruby-lang.org/issues/7715#change-35625

Posted by shugo (Shugo Maeda) (Guest)
on 2013-01-25 10:24
(Received via mailing list)
Issue #7715 has been updated by shugo (Shugo Maeda).


marcandre (Marc-Andre Lafortune) wrote:
> I can do it, unless there are objections.

Your proposal sounds reasonable.
I guess updating these methods was forgotten when lazy was 
implemented.

----------------------------------------
Bug #7715: Lazy enumerators should want to stay lazy.
https://bugs.ruby-lang.org/issues/7715#change-35629

Posted by yhara (Yutaka HARA) (Guest)
on 2013-01-25 10:31
(Received via mailing list)
Issue #7715 has been updated by yhara (Yutaka HARA).


shugo (Shugo Maeda) wrote:
> I guess updating these methods was forgotten when lazy was implemented.

That's right :-(   I thought these methods did not need to be overridden
because they return an Enumerator, but they should return an Enumerator::Lazy 
in such cases.

----------------------------------------
Bug #7715: Lazy enumerators should want to stay lazy.
https://bugs.ruby-lang.org/issues/7715#change-35630

Posted by marcandre (Marc-Andre Lafortune) (Guest)
on 2013-02-04 01:15
(Received via mailing list)
Issue #7715 has been updated by marcandre (Marc-Andre Lafortune).


I believe I have found the key to resolving this issue, the Lazy.new issue 
[#7248], and others.

We simply need to specialize `to_enum/enum_for` for lazy enumerators.

In the same way, RETURN_SIZED_ENUMERATOR should return a lazy 
enumerator, when called for a lazy enumerator.

With this in mind:
* Lazy#each_with_object, etc., will correctly return lazy enumerators 
[#7715] without being overridden.
* Lazy#cycle can be removed. It no longer needs to be overridden.
* Lazy.new really has no need to accept (method, *args) and can be 
modified as proposed in [#7248]
* Any user method of Enumerable that returns an Enumerator using 
`to_enum` will conserve laziness.
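
To illustrate the last point, here is a small sketch under the assumption that `to_enum` is specialized for lazy enumerators as described above. `each_doubled` is a hypothetical user-defined method, added to Enumerable purely for demonstration.

```ruby
module Enumerable
  # Hypothetical user method: yields each element doubled, and returns an
  # enumerator through to_enum when no block is given.
  def each_doubled
    return to_enum(:each_doubled) unless block_given?

    each { |x| yield x * 2 }
  end
end

doubled = (1..Float::INFINITY).lazy.each_doubled
p doubled.class # Enumerator::Lazy if to_enum conserves laziness

# Chaining on an endless source only terminates if `doubled` is lazy.
incremented = doubled.map { |x| x + 1 }.first(3)
p incremented
```

The method itself knows nothing about laziness; it only calls `to_enum`, so the receiver decides which kind of enumerator comes back.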

None of this could create a regression, since Lazy & 
RETURN_SIZED_ENUMERATOR are both new to 2.0.0

I'm working on a patch...
----------------------------------------
Bug #7715: Lazy enumerators should want to stay lazy.
https://bugs.ruby-lang.org/issues/7715#change-35814

Posted by marcandre (Marc-Andre Lafortune) (Guest)
on 2013-02-04 08:53
(Received via mailing list)
Issue #7715 has been updated by marcandre (Marc-Andre Lafortune).


Patch almost done, which also fixes #7248

  https://github.com/marcandre/ruby/compare/marcandr...

Still missing:
- tweak inspect
- fix .lazy.size
- couple more tests

----------------------------------------
Bug #7715: Lazy enumerators should want to stay lazy.
https://bugs.ruby-lang.org/issues/7715#change-35826

Posted by marcandre (Marc-Andre Lafortune) (Guest)
on 2013-02-04 23:35
(Received via mailing list)
Issue #7715 has been updated by marcandre (Marc-Andre Lafortune).


Patch updated, rdoc improved too.

Makes for a clean API for Lazy.new also, and there's even less code (~20 
loc).

I'll review the patch one last time before committing it (in about 5 
hours).
----------------------------------------
Bug #7715: Lazy enumerators should want to stay lazy.
https://bugs.ruby-lang.org/issues/7715#change-35836
