Forum: Ruby Bug: Ruby 1.9.1 File io on Windows is slooooow

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- und Ruby-related community platforms.
Damjan R. (Guest)
on 2009-03-27 15:38
This simple code shows everything:

time = [Time.new]
c = ''
'aaaa'.upto('zzzz') {|e| c << e}
3.times { c << c }
time << Time.new
File.open('out.file','w') { |f| f.write(c) }
time << Time.new
c = File.open('out.file','r') { |f| f.read }
time << Time.new
0.upto(time.size - 2) {|i| p "#{i} #{time[i+1]-time[i]}" }

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"0 0.537075"
"1 0.696244"
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"0 0.551"
"1 0.133"
"2 0.087"

That is about 5x slower write and 500x read operation. Times are the
same if I do:
f = File.new('out.file','r')
c = f.read
f.close

Tried on two machines. Vista SP1 and XP SP3. Same results.


by
TheR


P.S.
Where can I officaly report a Ruby bug?
Alex F. (Guest)
on 2009-03-27 16:20
(Received via mailing list)
Damjan R. wrote:
> time << Time.new
> "2 0.087"
I'm pretty sure from your other message that it's using Ruby 1.9 with
UTF-8 encoded strings that kills performance. I get equivalent results
on OS X comparing 1.8.6 to 1.9.1 on common string operations like
String#[](a_fixnum).

It's understandable that correct character-wise operations are much
harder for Ruby to implement efficiently on variable-byte-length
encodings like UTF-8. Unfortunately, though I was delighted to get
proper encoding support in 1.9, the performance hit on strings was a
killer on my string-intensive GUI app.

There are sometimes ways round this (eg I rewrote diff/lcs to use string
character enumerators rather than repeated expensive calls to
String#[]).

I expect the Ruby team are already aware of the problem, and I hope this
is a top priority for 1.9.2.

btw, you can report bugs at:
http://redmine.ruby-lang.org/

alex
The H. (Guest)
on 2009-03-27 17:26
Alex F. wrote:
>
> There are sometimes ways round this (eg I rewrote diff/lcs to use string
> character enumerators rather than repeated expensive calls to
> String#[]).
>
> I expect the Ruby team are already aware of the problem, and I hope this
> is a top priority for 1.9.2.

Unfortunately it appears to be a designed-in pessimization, though I
initially thought it was a mistake when I saw it.  Guido van Rossum (the
Python creator) probably did too, as he asked Matz about it at the
40-minute mark here

http://www.youtube.com/watch?v=oEkJvvGEtB4
Nobuyoshi N. (Guest)
on 2009-03-28 04:27
(Received via mailing list)
Hi,

At Fri, 27 Mar 2009 23:16:38 +0900,
Alex F. wrote in [ruby-talk:332264]:
> I'm pretty sure from your other message that it's using Ruby 1.9 with
> UTF-8 encoded strings that kills performance. I get equivalent results
> on OS X comparing 1.8.6 to 1.9.1 on common string operations like
> String#[](a_fixnum).

Slow-read is caused by universal-newline.
Damjan R. (Guest)
on 2009-03-29 11:20
Nobuyoshi N. wrote:
> Hi,
>
> At Fri, 27 Mar 2009 23:16:38 +0900,
> Alex F. wrote in [ruby-talk:332264]:
>> I'm pretty sure from your other message that it's using Ruby 1.9 with
>> UTF-8 encoded strings that kills performance. I get equivalent results
>> on OS X comparing 1.8.6 to 1.9.1 on common string operations like
>> String#[](a_fixnum).
>
> Slow-read is caused by universal-newline.

Should'nt fread be reading binary file or should that be forced by 'rb'
parameter at file.open()

I did fast test on Ubuntu x64 Linux. And the problem does not exist.
Althow it was Ruby version 1.9.0.


by
TheR
Robert K. (Guest)
on 2009-03-29 15:06
(Received via mailing list)
2009/3/27 Damjan R. <removed_email_address@domain.invalid>:
> c = File.open('out.file','r') { |f| f.read }
> "1 0.133"
> "2 0.087"
>
> That is about 5x slower write and 500x read operation.

Do you have a virus scanner on that Windows box?  If so, how are
measurements when switched off?

Cheers

robert
Damjan R. (Guest)
on 2009-03-30 09:43
Robert K. wrote:
> 2009/3/27 Damjan R. <removed_email_address@domain.invalid>:
>> c = File.open('out.file','r') { |f| f.read }
>> "1 0.133"
>> "2 0.087"
>>
>> That is about 5x slower write and 500x read operation.
>
> Do you have a virus scanner on that Windows box?  If so, how are
> measurements when switched off?
>
> Cheers
>
> robert

I swithed it off and times are pretty much the same.

by
TheR
Roger P. (Guest)
on 2009-12-02 18:37
> ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
> "2 40.188834"
>
> ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
> "2 0.087"
>
> That is about 5x slower write and 500x read operation. Times are the
> same if I do:
> f = File.new('out.file','r')
> c = f.read
> f.close
> P.S.
> Where can I officaly report a Ruby bug?

Looks like this bug is http://redmine.ruby-lang.org/issues/show/1332
Can you try with mode 'rb' and is it fast then?
-r
Ad G. (Guest)
on 2010-02-05 23:52
Roger P. wrote:
>
>> ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
>> "2 40.188834"
>>
>> ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
>> "2 0.087"
>>
>> That is about 5x slower write and 500x read operation. Times are the
>> same if I do:
>> f = File.new('out.file','r')
>> c = f.read
>> f.close
>> P.S.
>> Where can I officaly report a Ruby bug?
>
> Looks like this bug is http://redmine.ruby-lang.org/issues/show/1332
> Can you try with mode 'rb' and is it fast then?
> -r

Also this one is much much slower on ruby 1.9 then on ruby 1.8:
Tested on Windows XP. No virus scanner.

time = [Time.new]
n = 1000
lines = []
(1..n).each { |nr| lines << "line" }

time << Time.new
File.open('out2.file','w') { |f|
  lines.each { |l|
    f.print(l)
    f.flush
  }
}
time << Time.new

txt = ['construct','write','read']
0.upto(time.size - 2) {|i| p "#{txt[i]} #{time[i+1]-time[i]}" }


#ruby 1.9
#"construct 0.0"
#"write 22.453125"

#ruby 1.8
#"construct 0.0"
#"write 0.015"
Robert K. (Guest)
on 2010-02-06 15:45
(Received via mailing list)
On 02/05/2010 10:53 PM, Ad Gr wrote:
>>> c = f.read
> time = [Time.new]
> }
> #ruby 1.8
> #"construct 0.0"
> #"write 0.015"

How many test runs did you do?  Because of caching figures from a single
run are usually not very telling.

Kind regards

  robert
Roger P. (Guest)
on 2010-02-06 23:17
>     f.print(l)
>     f.flush

If you remove the flush it seems to work *much* faster.
This might be a bug in core.  I'd probably report it.


I think to avoid the slowdown with reads you can "hack a work around"
like

c = File.open('out.file','rb') { |f| f.read }
c.gsub!("\r\n", "\n")

But there might be a bug in there, too. [1]

-rp
[1] http://redmine.ruby-lang.org/issues/show/1332
Roger P. (Guest)
on 2010-02-15 16:28
Damjan R. wrote:
>
> This simple code shows everything:
...
> That is about 5x slower write and 500x read operation.

You can try with some patches at

http://www.ruby-forum.com/topic/204113

then it's only 36x as slow for reads :)

-r
This topic is locked and can not be replied to.