Consider using C or C++ for low-level, memory- or performance-critical
tasks. You can embed C in Ruby.
Just open irb, create a big array like
(x=Array.new(3*10**8){0}).class
and watch your RAM usage. On my machine, I’m getting a bit more than 8
bytes (64-bit word size) for each ‘0’ (= one word-sized reference per
slot, the same size as 0.object_id, plus array overhead).
Using
99999999999999999999999999999999 instead of 0 results in the same RAM
usage, because there is only one object and every slot just holds a
reference to it (see below). Try: a=999 ; b=999 ; a.object_id == b.object_id
As you mentioned, you could use
(x=Array.new(3*10**7){"\x00"}).class
I get almost 54 bytes per null byte, because the block creates a
separate String object for every entry. Note that Array.new(10,[1]) will
only create one instance of the object, while Array.new(10){[1]} will
create 10 instances. To see this, try
x = Array.new(10,[0])
x[0][0] = 9
p x
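For comparison, here is a small sketch that puts the two forms side by side (the variable names shared and distinct are just mine):

```ruby
# Default-value form: every slot refers to the SAME inner array.
shared = Array.new(3, [0])
shared[0][0] = 9
p shared                                # => [[9], [9], [9]]
p shared.map(&:object_id).uniq.size     # => 1

# Block form: the block runs once per slot, so each slot gets its own array.
distinct = Array.new(3) { [0] }
distinct[0][0] = 9
p distinct                              # => [[9], [0], [0]]
p distinct.map(&:object_id).uniq.size   # => 3
```

The block form is the one that costs memory per entry, because each slot then points at a separate object.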
Even this uses 8 bytes for each array entry, although every entry is nil:
(x=Array.new(3*10**8)).class
And the conclusion is, Ruby needs to store at least one reference per
array entry (a machine word, 64 bits on my machine, the same size as an
Object#object_id). Ruby arrays are not fit for the task you’re trying to
accomplish.
Changing to strings,
(x="\x00" * 3*10**8).class
This gives me 300 MB RAM usage, i.e. 1 byte for each null byte.
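If you go the string route, you can still read and write individual entries. A minimal sketch, assuming your values fit in one byte (0..255); the names n and buf are mine, and I keep n small here:

```ruby
# A binary string used as a compact byte array: one byte per element.
n = 1_000_000                 # scale up as needed
buf = "\x00".b * n            # .b gives a binary-encoded copy of the one-byte string

buf.setbyte(42, 255)          # write the value 255 at index 42
p buf.getbyte(42)             # => 255
p buf.getbyte(43)             # => 0
p buf.bytesize                # => 1000000
```

String#getbyte and String#setbyte work on raw bytes, so the string's encoding does not get in the way. For values wider than one byte you would have to pack them yourself, e.g. with Array#pack and String#unpack.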
Getting the size of an object can be tricky in Ruby; arrays store
references to objects, not the objects themselves. See also
https://www.ruby-forum.com/topic/156648
https://gist.github.com/camertron/2939093 is a class that estimates the
memory used by an object, partly by guesswork. Don’t expect magic.
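If you only need a rough number, the objspace library that ships with CRuby has ObjectSpace.memsize_of. A minimal sketch, assuming CRuby; the result is an implementation-dependent hint, and for an array it does not include the objects the array refers to:

```ruby
require 'objspace'

n = 100_000
refs  = Array.new(n)      # n slots, each holding a reference (here to nil)
bytes = "\x00" * n        # n bytes of payload

# memsize_of reports the memory CRuby attributes to the object itself,
# not to anything it references; treat the numbers as estimates.
puts "array:  #{ObjectSpace.memsize_of(refs)} bytes"
puts "string: #{ObjectSpace.memsize_of(bytes)} bytes"
```

On a 64-bit build I would expect the array to come out at roughly 8 times the size of the string, which matches what you see when watching RAM usage.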