Forum: Ruby-core [ruby-trunk - Feature #6047][Assigned] read_all: Grow buffer exponentially in generic case

Posted by Martin Bosslet (martin_b)
on 2012-02-19 21:48
(Received via mailing list)
Issue #6047 has been reported by Martin Bosslet.

----------------------------------------
Feature #6047: read_all: Grow buffer exponentially in generic case
https://bugs.ruby-lang.org/issues/6047

Author: Martin Bosslet
Status: Assigned
Priority: Normal
Assignee: Martin Bosslet
Category: core
Target version: 2.0.0


In the general case, read_all grows its buffer linearly by just the
amount that is currently read from the underlying source. This results
in a linear number of reallocs. It might be beneficial if the
buffer were grown exponentially by multiplying its size by a constant
factor (e.g. 1.5 or 2), resulting in only a logarithmic number of
reallocs.
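
To make the difference concrete, here is a minimal Python simulation that only counts buffer growths. It is a sketch, not Ruby's actual read_all: the function names, the 8192-byte chunk size, and the growth factor of 2 are assumptions chosen for illustration.

```python
def reallocs_linear(total_bytes, chunk=8192):
    """Count growths when the buffer grows by exactly the amount just read."""
    capacity = 0
    filled = 0
    reallocs = 0
    while filled < total_bytes:
        n = min(chunk, total_bytes - filled)
        if filled + n > capacity:
            capacity = filled + n  # grow only as far as this read needs
            reallocs += 1
        filled += n
    return reallocs


def reallocs_exponential(total_bytes, chunk=8192, factor=2):
    """Count growths when the buffer capacity is multiplied by `factor`."""
    capacity = 0
    filled = 0
    reallocs = 0
    while filled < total_bytes:
        n = min(chunk, total_bytes - filled)
        if filled + n > capacity:
            # Multiply the capacity, but never end up below what this read needs.
            capacity = max(filled + n, int(capacity * factor))
            reallocs += 1
        filled += n
    return reallocs


if __name__ == "__main__":
    size = 1 << 20  # 1 MiB of input, an arbitrary example size
    print("linear:     ", reallocs_linear(size))
    print("exponential:", reallocs_exponential(size))
```

Under these assumptions the linear strategy grows the buffer once per chunk read, while the exponential strategy grows it once per doubling of the total size.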

I will provide a patch and benchmarks, but I'm already opening this 
issue so I won't forget.

See also https://bugs.ruby-lang.org/issues/5353 for more details.
Posted by ko1 (Koichi Sasada) (Guest)
on 2012-10-26 23:39
(Received via mailing list)
Issue #6047 has been updated by ko1 (Koichi Sasada).


Ping. What is the status?
Do you need any help or comments?

----------------------------------------
Feature #6047: read_all: Grow buffer exponentially in generic case
https://bugs.ruby-lang.org/issues/6047#change-31677

Author: MartinBosslet (Martin Bosslet)
Status: Assigned
Priority: Normal
Assignee: MartinBosslet (Martin Bosslet)
Category: core
Target version: 2.0.0


Posted by Martin Bosslet (martin_b)
on 2012-11-21 03:49
(Received via mailing list)
Issue #6047 has been updated by MartinBosslet (Martin Bosslet).


ko1 (Koichi Sasada) wrote:
> Ping. What is the status?
> Do you need any help or comments?

Thanks for offering to help. To be honest, I haven't started on it yet.
Can we leave the target at 2.0.0 for now? If I run into problems, I'll
ask here!

----------------------------------------
Feature #6047: read_all: Grow buffer exponentially in generic case
https://bugs.ruby-lang.org/issues/6047#change-33374

Author: MartinBosslet (Martin Bosslet)
Status: Assigned
Priority: Normal
Assignee: MartinBosslet (Martin Bosslet)
Category: core
Target version: 2.0.0


Posted by Eric Wong (Guest)
on 2012-11-21 04:43
(Received via mailing list)
Martin Bosslet <Martin.Bosslet@googlemail.com> wrote:
> In the general case, read_all grows its buffer linearly by just the
> amount that is currently read from the underlying source. This results
> in a linear number of reallocs. It might be beneficial if the
> buffer were grown exponentially by multiplying its size by a constant
> factor (e.g. 1.5 or 2), resulting in only a logarithmic number of
> reallocs.

I think growing the buffer exponentially makes sense.

I would enforce a hard limit (probably <= 8 MB) on each growth step,
to:

1) discourage read_all() for large files; it's very wasteful and
   usually hurts performance

2) prevent memory exhaustion in edge cases (especially on 32-bit)
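
One way to picture the capped growth suggested above is the following Python sketch. It is only an illustration under stated assumptions, not code from Ruby's io.c: the names `capped_capacity` and `capacities_while_reading`, the 8192-byte chunk size, and the factor of 2 are hypothetical, and the 8 MiB cap is taken from the "probably <= 8 MB" figure.

```python
MIB = 1024 * 1024


def capped_capacity(capacity, needed, factor=2, max_step=8 * MIB):
    """Grow geometrically, but add at most `max_step` bytes per growth."""
    step = min(int(capacity * factor) - capacity, max_step)
    # Always return at least `needed`, even if that exceeds the capped step.
    return max(capacity + step, needed)


def capacities_while_reading(total_bytes, chunk=8192):
    """Return the buffer capacity after each growth while reading in chunks."""
    capacity = 0
    filled = 0
    history = []
    while filled < total_bytes:
        n = min(chunk, total_bytes - filled)
        if filled + n > capacity:
            capacity = capped_capacity(capacity, filled + n)
            history.append(capacity)
        filled += n
    return history


if __name__ == "__main__":
    for cap in capacities_while_reading(32 * MIB):
        print(cap)
```

With these parameters the buffer doubles until a doubling would add more than 8 MiB, after which it grows in fixed 8 MiB steps. Growth is therefore exponential for small inputs and linear, with a large step, for big ones, which bounds the over-allocation at 8 MiB.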
Posted by mame (Yusuke Endoh) (Guest)
on 2012-11-24 03:48
(Received via mailing list)
Issue #6047 has been updated by mame (Yusuke Endoh).

Target version changed from 2.0.0 to next minor

My experience also shows that it is useless to open a ticket for a 
reminder to myself :-)

I'm tentatively setting the target to next minor, but if it is really
just a performance improvement (i.e., it affects no external modules),
you can commit it to 2.0.0 before the code freeze.

--
Yusuke Endoh <mame@tsg.ne.jp>
----------------------------------------
Feature #6047: read_all: Grow buffer exponentially in generic case
https://bugs.ruby-lang.org/issues/6047#change-33749

Author: MartinBosslet (Martin Bosslet)
Status: Assigned
Priority: Normal
Assignee: MartinBosslet (Martin Bosslet)
Category: core
Target version: next minor

