Bug: Ruby 1.9.1 File io on Windows is slooooow


#1

This simple code shows everything:

time = [Time.new]
c = ‘’
‘aaaa’.upto(‘zzzz’) {|e| c << e}
3.times { c << c }
time << Time.new
File.open(‘out.file’,‘w’) { |f| f.write© }
time << Time.new
c = File.open(‘out.file’,‘r’) { |f| f.read }
time << Time.new
0.upto(time.size - 2) {|i| p “#{i} #{time[i+1]-time[i]}” }

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
“0 0.537075”
“1 0.696244”
“2 40.188834”

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
“0 0.551”
“1 0.133”
“2 0.087”

That is about 5x slower write and 500x read operation. Times are the
same if I do:
f = File.new(‘out.file’,‘r’)
c = f.read
f.close

Tried on two machines. Vista SP1 and XP SP3. Same results.

by
TheR

P.S.
Where can I officaly report a Ruby bug?


#2

Damjan R. wrote:

time << Time.new
“2 0.087”
I’m pretty sure from your other message that it’s using Ruby 1.9 with
UTF-8 encoded strings that kills performance. I get equivalent results
on OS X comparing 1.8.6 to 1.9.1 on common string operations like
String#.

It’s understandable that correct character-wise operations are much
harder for Ruby to implement efficiently on variable-byte-length
encodings like UTF-8. Unfortunately, though I was delighted to get
proper encoding support in 1.9, the performance hit on strings was a
killer on my string-intensive GUI app.

There are sometimes ways round this (eg I rewrote diff/lcs to use string
character enumerators rather than repeated expensive calls to
String#[]).

I expect the Ruby team are already aware of the problem, and I hope this
is a top priority for 1.9.2.

btw, you can report bugs at:
http://redmine.ruby-lang.org/

alex


#3

Alex F. wrote:

There are sometimes ways round this (eg I rewrote diff/lcs to use string
character enumerators rather than repeated expensive calls to
String#[]).

I expect the Ruby team are already aware of the problem, and I hope this
is a top priority for 1.9.2.

Unfortunately it appears to be a designed-in pessimization, though I
initially thought it was a mistake when I saw it. Guido van Rossum (the
Python creator) probably did too, as he asked Matz about it at the
40-minute mark here

http://www.youtube.com/watch?v=oEkJvvGEtB4


#4

Nobuyoshi N. wrote:

Hi,

At Fri, 27 Mar 2009 23:16:38 +0900,
Alex F. wrote in [ruby-talk:332264]:

I’m pretty sure from your other message that it’s using Ruby 1.9 with
UTF-8 encoded strings that kills performance. I get equivalent results
on OS X comparing 1.8.6 to 1.9.1 on common string operations like
String#.

Slow-read is caused by universal-newline.

Should’nt fread be reading binary file or should that be forced by ‘rb’
parameter at file.open()

I did fast test on Ubuntu x64 Linux. And the problem does not exist.
Althow it was Ruby version 1.9.0.

by
TheR


#5

Hi,

At Fri, 27 Mar 2009 23:16:38 +0900,
Alex F. wrote in [ruby-talk:332264]:

I’m pretty sure from your other message that it’s using Ruby 1.9 with
UTF-8 encoded strings that kills performance. I get equivalent results
on OS X comparing 1.8.6 to 1.9.1 on common string operations like
String#.

Slow-read is caused by universal-newline.


#6

2009/3/27 Damjan R. removed_email_address@domain.invalid:

c = File.open(‘out.file’,‘r’) { |f| f.read }
“1 0.133”
“2 0.087”

That is about 5x slower write and 500x read operation.

Do you have a virus scanner on that Windows box? If so, how are
measurements when switched off?

Cheers

robert


#7

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
“2 40.188834”

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
“2 0.087”

That is about 5x slower write and 500x read operation. Times are the
same if I do:
f = File.new(‘out.file’,‘r’)
c = f.read
f.close
P.S.
Where can I officaly report a Ruby bug?

Looks like this bug is http://redmine.ruby-lang.org/issues/show/1332
Can you try with mode ‘rb’ and is it fast then?
-r


#8

Robert K. wrote:

2009/3/27 Damjan R. removed_email_address@domain.invalid:

c = File.open(‘out.file’,‘r’) { |f| f.read }
“1 0.133”
“2 0.087”

That is about 5x slower write and 500x read operation.

Do you have a virus scanner on that Windows box? If so, how are
measurements when switched off?

Cheers

robert

I swithed it off and times are pretty much the same.

by
TheR


#9

Roger P. wrote:

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
“2 40.188834”

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
“2 0.087”

That is about 5x slower write and 500x read operation. Times are the
same if I do:
f = File.new(‘out.file’,‘r’)
c = f.read
f.close
P.S.
Where can I officaly report a Ruby bug?

Looks like this bug is http://redmine.ruby-lang.org/issues/show/1332
Can you try with mode ‘rb’ and is it fast then?
-r

Also this one is much much slower on ruby 1.9 then on ruby 1.8:
Tested on Windows XP. No virus scanner.

time = [Time.new]
n = 1000
lines = []
(1…n).each { |nr| lines << “line” }

time << Time.new
File.open(‘out2.file’,‘w’) { |f|
lines.each { |l|
f.print(l)
f.flush
}
}
time << Time.new

txt = [‘construct’,‘write’,‘read’]
0.upto(time.size - 2) {|i| p “#{txt[i]} #{time[i+1]-time[i]}” }

#ruby 1.9
#“construct 0.0”
#“write 22.453125”

#ruby 1.8
#“construct 0.0”
#“write 0.015”


#10
f.print(l)
f.flush

If you remove the flush it seems to work much faster.
This might be a bug in core. I’d probably report it.

I think to avoid the slowdown with reads you can “hack a work around”
like

c = File.open(‘out.file’,‘rb’) { |f| f.read }
c.gsub!("\r\n", “\n”)

But there might be a bug in there, too. [1]

-rp
[1] http://redmine.ruby-lang.org/issues/show/1332


#11

On 02/05/2010 10:53 PM, Ad Gr wrote:

c = f.read
time = [Time.new]
}
#ruby 1.8
#“construct 0.0”
#“write 0.015”

How many test runs did you do? Because of caching figures from a single
run are usually not very telling.

Kind regards

robert


#12

Damjan R. wrote:

This simple code shows everything:

That is about 5x slower write and 500x read operation.

You can try with some patches at

http://www.ruby-forum.com/topic/204113

then it’s only 36x as slow for reads :slight_smile:

-r