Array Creation

Hello

I was wondering why Ruby takes a long time to create an array. I tried:

Start = Time.now
names = Array.new(614400000)
puts Time.now-Start

Ruby took around 308 seconds to create this large array. Can anyone tell
me why Ruby takes such a long time? Also, assigning a variable to
the array takes a long time. Why is that?

On Wed, Sep 26, 2012 at 3:21 PM, Tridib B.
[email protected] wrote:

names = Array.new(614400000)

Ruby took around 308 seconds to create this large array. Can anyone tell
me why Ruby takes such a long time?

Precisely BECAUSE it’s so large! You may well have been running out
of memory, needing to swap RAM out to disk, which is very very slow.
Whatever you were intending to do with such a huge array, would
probably benefit from a good hard look at alternative algorithms,
especially focusing on dealing with only a small subset of the data at
one time.

-Dave

Dave A. wrote in post #1077665:

On Wed, Sep 26, 2012 at 3:21 PM, Tridib B.
[email protected] wrote:

names = Array.new(614400000)

Ruby took around 308 seconds to create this large array. Can anyone tell
me why Ruby takes such a long time?

Precisely BECAUSE it’s so large! You may well have been running out
of memory, needing to swap RAM out to disk, which is very very slow.
Whatever you were intending to do with such a huge array, would
probably benefit from a good hard look at alternative algorithms,
especially focusing on dealing with only a small subset of the data at
one time.

-Dave

OK, but when I declare the same array in my extension using a simple
malloc() call, it takes much less time. Is there some internal
mechanism at work when Ruby creates an array by itself?

Thank You

On Wed, Sep 26, 2012 at 1:32 PM, Tridib B.
[email protected] wrote:

OK, but when I declare the same array in my extension using a simple
malloc() call, it takes much less time. Is there some internal
mechanism at work when Ruby creates an array by itself?

You’re probably doing an inaccurate comparison. You’re asking Ruby to
allocate an array with 614,400,000 slots, which in C is equivalent to the
same number of pointers, which depending on whether you’re on a 32-bit or
64-bit host translates to either 2.4GB or 4.9GB.
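
The arithmetic behind those two figures can be checked with a short Ruby sketch. The method name below is made up for illustration, and it counts only the slot storage (one pointer-sized value per slot), not the Array object header or any allocator overhead.

```ruby
# Hypothetical helper: bytes needed for the slot storage alone,
# assuming one pointer-sized value per slot.
def slot_storage_bytes(slots, pointer_size)
  slots * pointer_size
end

SLOTS = 614_400_000

# Pointer sizes assumed here: 4 bytes on a 32-bit host, 8 on a 64-bit host.
{ '32-bit' => 4, '64-bit' => 8 }.each do |host, pointer_size|
  bytes = slot_storage_bytes(SLOTS, pointer_size)
  puts format('%s host: %d bytes (%.2f GB)', host, bytes, bytes / 1e9)
end
```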

In general, I’d say the fact you’re even attempting to do this is extremely
suspect. What is your goal?

On Wed, Sep 26, 2012 at 3:47 PM, Tony A. [email protected]
wrote:

You’re probably doing an inaccurate comparison. You’re asking Ruby to
allocate an array with 614,400,000 slots, which in C is equivalent to the
same number of pointers, which depending on whether you’re on a 32-bit or
64-bit host translates to either 2.4GB or 4.9GB.

It’s worse than that. (“He’s dead, Jim!”) In Ruby it’s not just a
straightforward memory allocation. The Array object itself needs
additional setup, and for all I know there may be setup overhead for
each individual slot. A closer (but still not quite direct)
comparison would be to allocate memory for a huge string, such as:

str = 'x' * 614400000

Depending on your character encoding settings, this might allocate
anywhere from 0.6G to 2.4G; you can force it higher by using some
obscure 32-bit character, or of course a longer initial string or
higher number.
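
To see how the encoding changes the size, here is a small sketch at a much reduced count. The helper name is invented for this example; String#encode, String#* and String#bytesize are standard Ruby. It only reports the byte length of the resulting string, not what the interpreter allocates internally.

```ruby
# Hypothetical helper: byte length of `char` repeated `count` times
# after transcoding it to the given encoding.
def repeated_bytesize(char, count, encoding)
  (char.encode(encoding) * count).bytesize
end

REPEAT_COUNT = 1_000 # scaled down from 614400000 so it runs anywhere

%w[UTF-8 UTF-16LE UTF-32LE].each do |encoding|
  puts "#{encoding}: #{repeated_bytesize('x', REPEAT_COUNT, encoding)} bytes"
end
```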

In general, I’d say the fact you’re even attempting to do this is extremely
suspect.

+1000!

-Dave

On Sep 26, 2012, at 12:21 , Tridib B. [email protected]
wrote:

I was wondering why Ruby takes a long time to create an array. I tried:

Start = Time.now
names = Array.new(614400000)
puts Time.now-Start

Ruby took around 308 seconds to create this large array. Can anyone tell
me why Ruby takes such a long time? Also, assigning a variable to
the array takes a long time. Why is that?

That’s invariably swap… MUUUCH slower to access a disk than memory.

On a machine with enough memory it takes about the same amount of time
to do equivalent work:

% /usr/bin/time -l ruby19 -e 'Array.new(614400000)'
2.74 real 0.88 user 1.82 sys
4919554048 maximum resident set size

0 swaps

% cc -O3 -Wall -pedantic -std=c99 woot.c && /usr/bin/time -l ./a.out
2.92 real 1.22 user 1.69 sys
4915695616 maximum resident set size

0 swaps

where woot.c is:

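#include <stdlib.h>

typedef unsigned long VALUE;

int main(void) {
  size_t count = 614400000;
  VALUE *names = malloc(count * sizeof(VALUE));
  if (names == NULL) return 1;  /* allocation failed */

  /* Write every slot so the OS must actually provide each page. */
  for (size_t i = 0; i < count; i++) {
    names[i] = 4;
  }

  free(names);
  return 0;
}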
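
For anyone who wants to repeat the measurement without risking swap, a scaled-down sketch in Ruby might look like the following. The helper and constant names are made up for this example, the slot count is an arbitrary small value, and Process.clock_gettime needs Ruby 2.1 or later (newer than the ruby19 used above).

```ruby
# Hypothetical helper: run the block and return the elapsed wall-clock
# time in seconds, using a monotonic clock.
def elapsed_seconds
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

SMALL_SLOT_COUNT = 1_000_000 # far below 614400000, so it fits in RAM

seconds = elapsed_seconds { Array.new(SMALL_SLOT_COUNT) }
puts format('Array.new(%d) took %.6f s', SMALL_SLOT_COUNT, seconds)
```

Raising SMALL_SLOT_COUNT step by step should show roughly where a given machine starts to swap.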

After reading all the responses to this, I was curious to see how it
would do on my system. It seemed to do fine.

wayne$ ruby largearray.rb
6.730365

But this is also on a 64-bit system with lots of RAM. So, I’m not sure
it’s a fair assessment of your script. And normally I wouldn’t create
such a large array. Now, maybe you were doing this to test some larger
script and wanted to see the performance, but I’ve got some pretty large
arrays in a few of my scripts and they don’t have performance problems
on any of the computers I use them on. But I don’t think they come close
to the size you created here.

Wayne

Dave A. писал 27.09.2012 01:38:

It’s worse than that. (“He’s dead, Jim!”) In Ruby it’s not just a
straightforward memory allocation. The Array object itself needs
additional setup, and for all I know there may be setup overhead for
each individual slot.

There is none. The whole slot range is allocated with one xmalloc():
http://rxr.whitequark.org/mri/source/array.c#319

On Wed, Sep 26, 2012 at 11:38 PM, Dave A.
[email protected] wrote:

each individual slot. A closer (but still not quite direct)
comparison would be to allocate memory for a huge string, such as:

str = 'x' * 614400000

Depending on your character encoding settings, this might allocate
anywhere from 0.6G to 2.4G; you can force it higher by using some
obscure 32-bit character, or of course a longer initial string or
higher number.

It’s yet even worse (“Aye, the haggis is in the fire for sure.”):
after the malloc you get empty memory pages which do not exist yet. The
OS just has to do some minor bookkeeping to remember the reserved
address space of the process. Memory pages only need to be provided
once that memory is actually accessed. In the absence of a second
parameter to Array.new, all the Ruby Array slots need to be
initialized with 4 (Qnil) and cannot be left at 0 (Qfalse), so all pages
need to be actually provided by the OS so they can be written. With
such a large number of pages, a lot of them are likely swapped out to disk,
which is awfully slooooouw. See:

http://rxr.whitequark.org/mri/source/include/ruby/ruby.h#360
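
The nil fill is easy to observe from Ruby itself. The sketch below uses a made-up helper name and a tiny array. On MRI, object_id for nil and false is assumed here to mirror the internal values mentioned above, so newer builds may print a number other than 4 for nil.

```ruby
# Hypothetical helper: true when a fresh Array.new(n) holds only nil.
def nil_filled?(slots)
  Array.new(slots).all?(&:nil?)
end

puts nil_filled?(1_000)          # every slot was written with nil
puts Array.new(3, false).inspect # a second argument changes the fill value

# Assumption: on MRI these reflect the internal Qnil and Qfalse values.
puts "nil.object_id   = #{nil.object_id}"
puts "false.object_id = #{false.object_id}"
```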

Cheers

robert

2012/9/27 Theresa S. [email protected]:

How can I get out of this mail list? I’m about to force all mail list
traffic thru my junk mail! I’ve followed the protocol & unsubscribed MANY
times!!

No you didn’t. You need to mail [email protected] with
“unsubscribe” in message body.

– Matma R.

How can I get out of this mail list? I’m about to force all mail list
traffic thru my junk mail! I’ve followed the protocol & unsubscribed MANY
times!!

Help
Theresa

On Sep 27, 2012, at 10:59 AM, Bartosz Dziewoński [email protected]
wrote:

2012/9/27 Theresa S. [email protected]:

How can I get out of this mail list? I’m about to force all mail list
traffic thru my junk mail! I’ve followed the protocol & unsubscribed MANY
times!!

No you didn’t. You need to mail [email protected] with
“unsubscribe” in message body.

Then you need to confirm the unsubscribe request
[email protected] will send you.

It takes four emails to unsubscribe:

To: [email protected] “unsubscribe”
From: [email protected] “do you really want to unsubscribe?”
To: [email protected] “yes I really want to unsubscribe”
From: [email protected] “ok, you are unsubscribed”