Re-map Portion of Ruby Heap Holding Optree to Save Child Server Memory?

I wonder if it is possible to reduce the memory footprint of a pack of 6
mongrels by about 70 meg.

In reading Hongli’s blog about his revisions to Ruby GC (last Monday
Matz asked if he could use Hongli’s work), I was wondering at the size
(in RSS) of all the
mongrels in a cluster. It has always seemed just too big for my C-binary
intuition. Why is so much of the memory of the parent not shared by the
children? In other words, why are all the children so bloated? Obviously,
something prevents copy-on-write (COW) from being applied to far more of
the memory than I expect. (I expect the stack to be private, plus a bit
of process-malloc’ed heap.)
The pmap of a running mongrel says that by far the bulk of the memory
immediately following launch is mapped anon / private for:
libruby.so  23M
libnkf.so    6M (for Kanji)
libsyck.so   7M (for YAML)
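
For anyone who wants to check the private vs. shared split per mapping rather than eyeball pmap output, here is a minimal sketch in Ruby. It assumes the input follows the Linux /proc/<pid>/smaps text layout; the helper name `summarize_smaps` and the sample numbers are made up for illustration and are not taken from a real mongrel.

```ruby
# Sum private and shared kilobytes per mapping from smaps-format text.
# Assumption: the text follows the Linux /proc/<pid>/smaps layout.
def summarize_smaps(text)
  totals = Hash.new { |h, k| h[k] = { private_kb: 0, shared_kb: 0 } }
  current = nil
  text.each_line do |line|
    # Mapping header: address range, perms, offset, device, inode, optional path.
    if line =~ /\A\h+-\h+\s+\S+\s+\h+\s+\S+\s+\d+\s*(.*)/
      name = $1.strip
      current = name.empty? ? "[anon]" : name
    elsif current && line =~ /\A(Private|Shared)_(?:Clean|Dirty):\s+(\d+)\s+kB/
      key = $1 == "Private" ? :private_kb : :shared_kb
      totals[current][key] += $2.to_i
    end
  end
  totals
end

# Hypothetical sample input (invented numbers, not real measurements).
SAMPLE = <<~SMAPS
  00400000-00401000 r-xp 00000000 08:01 1234   /usr/lib/libruby.so
  Size:                  4 kB
  Shared_Clean:          4 kB
  Shared_Dirty:          0 kB
  Private_Clean:         0 kB
  Private_Dirty:         0 kB
  00600000-00800000 rw-p 00000000 00:00 0
  Size:               2048 kB
  Shared_Clean:          0 kB
  Shared_Dirty:          0 kB
  Private_Clean:         0 kB
  Private_Dirty:      2048 kB
SMAPS

summarize_smaps(SAMPLE).each do |name, kb|
  puts "#{name}: private=#{kb[:private_kb]} kB shared=#{kb[:shared_kb]} kB"
end
```

On a live Linux box the same function could be fed `File.read("/proc/#{pid}/smaps")` for each mongrel pid; a large private total on anonymous mappings would point at the heap rather than the library text.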

I guess that is because Ruby is a dynamic language, and always allows a
running process to modify the optree itself (‘monkey patching’). So, in
contrast to a compiled binary where the program code is all in the text
segment, Ruby keeps the ‘partly compiled bytecode’ optree in private
heap, where each process can mess with it at any time.

My question is, how important is that privacy? I know lots of modules
monkey patch as they load, but I can’t think of any cases where they
modify the optree at runtime. I’m sure there are some, but do they happen
in 99% of real-world Rails apps? I must be overlooking some common
cases…?

But if runtime optree modification is not always required, would it be
reasonable to have a flag that would re-mmap() the optree portions of the
heap as ‘read-only’ prior to forking the child servers? The payoff would
be that the read-only parts (which may be a couple of dozen meg) could be
treated as COW pages, saving them from being copied into each mongrel in
a cluster. It just annoys me to think that all these mongrels are
burdening the system by lugging around useless private copies of the
exact same optree!
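
To make the load-once-then-fork idea concrete, here is a rough Ruby sketch. It is not the re-mmap() flag proposed above, which would need changes inside the interpreter; it only shows the pattern that proposal relies on, where the parent builds everything before forking and the children merely read it. `SHARED_TABLE` and `spawn_reader` are hypothetical names standing in for the loaded application and a child server. Whether the pages really stay shared depends on the interpreter and GC not writing to them, which this sketch does not measure. It also assumes a platform where `fork` is available.

```ruby
# Parent builds the data once; SHARED_TABLE is a stand-in for loaded app code.
SHARED_TABLE = Array.new(100_000) { |i| "entry-#{i}" }.freeze

# Fork one child that reads from the parent-built table and reports back
# over a pipe. Returns the child pid and the read end of the pipe.
def spawn_reader(index)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    # The child only reads the inherited data; nothing is reloaded here.
    writer.puts SHARED_TABLE[index]
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  writer.close
  [pid, reader]
end

children = 3.times.map { |i| spawn_reader(i * 10) }

results = children.map do |pid, reader|
  line = reader.read.chomp
  reader.close
  Process.wait(pid)
  line
end

puts results.inspect
```

Each child sees the table the parent built without rebuilding it, so in principle the kernel only has to copy the pages a child actually writes to.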

One of the arguments for threading Ruby and Rails is that server
processes are so big. DHH has argued that process granularity is vastly
simpler and safer (bug-wise), and I agree. But I would like to reduce the
seeming bloat of a pack of mongrels.

Another, unrelated thought: I’ve never written a Rails app that used
Kanji, so why can’t I compile a version of mongrel without NKF? Why don’t
most Westerners run app servers without Kanji? The trend in Rails is
towards plug-ins for everything except the genuine core, so why not apply
the same philosophy to app servers?

I’ve always wondered the same thing: why does Rails use so very much
memory? Is it possible to cut it down?
As a note, I’ve tried some gc.c tweaks before (like changing the default
heap size allocation, etc.) and they only reduced memory and startup time
slightly:
running console, normal ruby: 25MB, takes 15.5s to start
running console, gc-optimized ruby: 23MB, takes 13.5s to start
So it may well be Rails’ fault. I should cross-post this in the Rails
group and see if they know :)
Take care.
-Roger