Forum: Ruby Array Creation

Posted by Tridib Bandopadhyay (tridib04)
on 2012-09-26 21:21
Hello

I was wondering why Ruby takes a long time to create an array. I tried:

Start = Time.now
names = Array.new(614400000)
puts Time.now-Start

Ruby took around 308 seconds to create this large array. Can anyone tell
me why Ruby takes such a long time? Also, assigning a variable to
the array takes a long time. Why is that?
Posted by Dave Aronson (Guest)
on 2012-09-26 21:30
(Received via mailing list)
On Wed, Sep 26, 2012 at 3:21 PM, Tridib Bandopadhyay
<lists@ruby-forum.com> wrote:

> names = Array.new(614400000)
...
> Ruby took around 308 seconds to create this large array. Can anyone tell
> me why Ruby takes such a long time?

Precisely BECAUSE it's so large!  You may well have been running out
of memory, needing to swap RAM out to disk, which is very very slow.
Whatever you were intending to do with such a huge array, would
probably benefit from a good hard look at alternative algorithms,
especially focusing on dealing with only a small subset of the data at
one time.
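
One way to read the "small subset at a time" advice, as a rough sketch only: walk the index range in fixed-size chunks rather than materializing everything at once. The sizes are scaled down, and process_chunk is a hypothetical stand-in for whatever the real work would be.

```ruby
# Sketch: handle a large index range in bounded chunks instead of
# building one giant array. Sizes are scaled down for illustration.
total      = 1_000_000
chunk_size = 10_000

# Hypothetical stand-in for the real per-chunk work: sums the chunk.
def process_chunk(chunk)
  chunk.sum
end

running_total = 0
# each_slice yields arrays of at most chunk_size elements, so only one
# chunk needs to be held in memory at a time.
(0...total).each_slice(chunk_size) do |chunk|
  running_total += process_chunk(chunk)
end

puts running_total
```

Peak memory then depends on chunk_size rather than on total.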

-Dave
Posted by Tridib Bandopadhyay (tridib04)
on 2012-09-26 21:32
Dave Aronson wrote in post #1077665:
> On Wed, Sep 26, 2012 at 3:21 PM, Tridib Bandopadhyay
> <lists@ruby-forum.com> wrote:
>
>> names = Array.new(614400000)
> ...
>> Ruby took around 308 seconds to create this large array. Can anyone tell
>> me why Ruby takes such a long time?
>
> Precisely BECAUSE it's so large!  You may well have been running out
> of memory, needing to swap RAM out to disk, which is very very slow.
> Whatever you were intending to do with such a huge array, would
> probably benefit from a good hard look at alternative algorithms,
> especially focusing on dealing with only a small subset of the data at
> one time.
>
> -Dave

OK, but when I allocate the same array in my extension with a simple
malloc() call, it takes very little time. Is there some internal
mechanism at work when Ruby creates an array by itself?

Thank You
Posted by Tony Arcieri (Guest)
on 2012-09-26 21:47
(Received via mailing list)
On Wed, Sep 26, 2012 at 1:32 PM, Tridib Bandopadhyay
<lists@ruby-forum.com> wrote:

> OK, but when I allocate the same array in my extension with a simple
> malloc() call, it takes very little time. Is there some internal
> mechanism at work when Ruby creates an array by itself?


You're probably doing an inaccurate comparison. You're asking Ruby to
allocate an array with 614,400,000 slots, which in C is equivalent to the
same number of pointers, which depending on whether you're on a 32-bit or
64-bit host translates to either 2.4GB or 4.9GB.
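
The arithmetic behind those two figures can be checked with a few lines of Ruby. This is only a back-of-the-envelope sketch: slot_bytes is a made-up helper name, and it counts the slots alone, not the Array object header or allocator overhead.

```ruby
# Sketch: bytes needed for the slots alone, at one pointer per slot.
# slot_bytes is a hypothetical helper, not part of Ruby.
def slot_bytes(slots, pointer_bytes)
  slots * pointer_bytes
end

slots = 614_400_000
[4, 8].each do |pointer_bytes|
  bytes = slot_bytes(slots, pointer_bytes)
  puts format('%d-byte pointers: %d bytes (%.2f GB)',
              pointer_bytes, bytes, bytes / 1e9)
end
```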

In general, I'd say the fact you're even attempting to do this is extremely
suspect. What is your goal?
Posted by Dave Aronson (Guest)
on 2012-09-26 23:39
(Received via mailing list)
On Wed, Sep 26, 2012 at 3:47 PM, Tony Arcieri <tony.arcieri@gmail.com> wrote:

> You're probably doing an inaccurate comparison. You're asking Ruby to
> allocate an array with 614,400,000 slots, which in C is equivalent to the
> same number of pointers, which depending on whether you're on a 32-bit or
> 64-bit host translates to either 2.4GB or 4.9GB.

It's worse than that.  ("He's dead, Jim!")  In Ruby it's not just a
straightforward memory allocation.  The Array object itself needs
additional setup, and for all I know there may be setup overhead for
each individual slot.  A closer (but still not quite direct)
comparison would be to allocate memory for a huge string, such as:

  str = 'x' * 614400000

Depending on your character encoding settings, this might allocate
anywhere from 0.6G to 2.4G; you can force it higher by using some
obscure 32-bit character, or of course a longer initial string or
higher number.
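
A scaled-down way to see how the byte count of such a string varies with encoding, as a sketch rather than a measurement of the original case. The choice of UTF-32LE as the four-bytes-per-character example is an assumption made here for illustration.

```ruby
# Sketch: same character count, different byte counts per encoding.
# n is scaled down from 614_400_000 so this runs instantly.
n = 1_000

one_byte_per_char   = 'x' * n
four_bytes_per_char = one_byte_per_char.encode('UTF-32LE')

puts one_byte_per_char.bytesize    # n bytes
puts four_bytes_per_char.bytesize  # 4 * n bytes
```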

> In general, I'd say the fact you're even attempting to do this is extremely
> suspect.

+1000!

-Dave
Posted by Ryan Davis (Guest)
on 2012-09-27 02:23
(Received via mailing list)
On Sep 26, 2012, at 12:21 , Tridib Bandopadhyay <lists@ruby-forum.com> wrote:

> I was wondering why Ruby takes a long time to create an array. I tried:
>
> Start = Time.now
> names = Array.new(614400000)
> puts Time.now-Start
>
> Ruby took around 308 seconds to create this large array. Can anyone tell
> me why Ruby takes such a long time? Also, assigning a variable to
> the array takes a long time. Why is that?


That's invariably swap... MUUUCH slower to access a disk than memory.

On a machine with enough memory it takes about the same amount of time
to do equivalent work:

  % /usr/bin/time -l ruby19 -e 'Array.new(614400000)'
          2.74 real         0.88 user         1.82 sys
  4919554048  maximum resident set size
  ...
           0  swaps
  ...

  % cc -O3 -Wall -pedantic -std=c99 woot.c && /usr/bin/time -l ./a.out
          2.92 real         1.22 user         1.69 sys
  4915695616  maximum resident set size
  ...
           0  swaps
  ...

where woot.c is:

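  #include <stdlib.h>  /* malloc, free, size_t */

  typedef unsigned long VALUE;

  int main(void) {
    size_t count = 614400000;
    VALUE *names = malloc(count * sizeof(VALUE));
    if (names == NULL) return 1;  /* allocation failed */

    /* Write every slot so each page is actually touched. */
    for (size_t i = 0; i < count; i++) {
      names[i] = 4;
    }

    free(names);
    return 0;
  }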
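
For anyone who wants to repeat the Ruby side of the measurement from inside a script, here is a minimal sketch at a much smaller size. time_it is a hypothetical helper; it uses a monotonic clock rather than Time.now, and the count is scaled down so it will not push a small machine into swap.

```ruby
# Sketch: time Array.new with a monotonic clock.
# time_it is a hypothetical helper; count is scaled down from 614_400_000.
def time_it
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

count   = 1_000_000
names   = nil
elapsed = time_it { names = Array.new(count) }

puts format('%d slots allocated in %.4f s', names.length, elapsed)
```

Raising count step by step while watching the elapsed time should show where a given machine starts to swap.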
Posted by Peter Zotov (Guest)
on 2012-09-27 07:54
(Received via mailing list)
Dave Aronson wrote on 27.09.2012 01:38:
>
> It's worse than that.  ("He's dead, Jim!")  In Ruby it's not just a
> straightforward memory allocation.  The Array object itself needs
> additional setup, and for all I know there may be setup overhead for
> each individual slot.

There is none. The whole slot range is allocated with one xmalloc():
   http://rxr.whitequark.org/mri/source/array.c#319
Posted by Wayne Brissette (Guest)
on 2012-09-27 11:32
(Received via mailing list)
After reading all the responses to this, I was curious to see how it 
would do on my system. It seemed to do fine.

wayne$ ruby largearray.rb
6.730365

But this is also on a 64-bit system with lots of RAM. So, I'm not sure 
it's a fair assessment of your script.  And normally I wouldn't create 
such a large array. Now, maybe you were doing this to test some larger 
script and wanted to see the performance, but I've got some pretty large 
arrays in a few of my scripts and they don't have performance problems 
on any of the computers I use them on. But I don't think they come close 
to the size you created here.

Wayne
Posted by Robert Klemme (robert_k78)
on 2012-09-27 18:57
(Received via mailing list)
On Wed, Sep 26, 2012 at 11:38 PM, Dave Aronson
<rubytalk2dave@davearonson.com> wrote:
> each individual slot.  A closer (but still not quite direct)
> comparison would be to allocate memory for a huge string, such as:
>
>   str = 'x' * 614400000
>
> Depending on your character encoding settings, this might allocate
> anywhere from 0.6G to 2.4G; you can force it higher by using some
> obscure 32-bit character, or of course a longer initial string or
> higher number.

It's yet even worse ("Aye, the haggis is in the fire for sure."):
after the malloc you get empty memory pages which do not exist yet.  The
OS just has to do some minor bookkeeping to remember all the reserved
address space of the process.  Only once that memory is actually
accessed do memory pages need to be provided.  Since, in the absence of a
second parameter to Array.new, all the Ruby Array slots need to be
initialized with 4 (Qnil) and cannot be left at 0 (Qfalse), all pages
need to be actually provided by the OS so they can be written.  With
the large number of pages, a lot of them are likely swapped out to disk,
which is awfully slooooouw.  See:

http://rxr.whitequark.org/mri/source/include/ruby/ruby.h#360
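
The "every slot gets written" point is visible from Ruby itself, in a small sketch with scaled-down sizes. The variable names are made up for illustration, and the number printed on the last line depends on the Ruby version and platform, so no value is claimed for it here.

```ruby
# Sketch: Array.new(n) hands back n slots that all already hold nil,
# so every slot has been written by the time the call returns.
n = 1_000

default_fill  = Array.new(n)      # no second argument: filled with nil
explicit_fill = Array.new(n, 0)   # second argument: filled with 0

puts default_fill.all?(&:nil?)
puts explicit_fill.all? { |v| v == 0 }

# On CRuby this is derived from the internal value of nil (Qnil).
puts nil.object_id
```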

Cheers

robert
Posted by Theresa Strepek (Guest)
on 2012-09-27 19:51
(Received via mailing list)
How can I get out of this mail list?  I'm about to force all mail list
traffic thru my junk mail!  I've followed the protocol & unsubscribed MANY
times!!

Help
Theresa


Posted by Bartosz Dziewoński (matmarex)
on 2012-09-27 19:59
(Received via mailing list)
2012/9/27 Theresa Strepek <t.strepek@tukaiz.com>:
> How can I get out of this mail list?  I'm about to force all mail list
> traffic thru my junk mail!  I've followed the protocol & unsubscribed MANY
> times!!

No you didn't. You need to mail ruby-talk-ctl@ruby-lang.org with
"unsubscribe" in message body.

-- Matma Rex
Posted by Eric Hodel (Guest)
on 2012-09-28 00:13
(Received via mailing list)
On Sep 27, 2012, at 10:59 AM, Bartosz Dziewoński <matma.rex@gmail.com> wrote:
> 2012/9/27 Theresa Strepek <t.strepek@tukaiz.com>:
>> How can I get out of this mail list?  I'm about to force all mail list
>> traffic thru my junk mail!  I've followed the protocol & unsubscribed MANY
>> times!!
>
> No you didn't. You need to mail ruby-talk-ctl@ruby-lang.org with
> "unsubscribe" in message body.

Then you need to confirm the unsubscribe request that
ruby-talk-ctl@ruby-lang.org will send you.

It takes four emails to unsubscribe:

To: ruby-talk-ctl@ruby-lang.org "unsubscribe"
From: ruby-talk-ctl@ruby-lang.org "do you really want to unsubscribe?"
To: ruby-talk-ctl@ruby-lang.org "yes I really want to unsubscribe"
From: ruby-talk-ctl@ruby-lang.org "ok, you are unsubscribed"