Rails Camp Scaling Session notes

Bill_Lipa · November 14, 2006, 12:12am

Here are some notes from the scalability session of last week’s Rails
camp. They were entered by another session participant and are posted
at:
http://www.rubyonrailscamp.com/10%3A15%2Bsession%2B-%2Bscaling

The key points from my point of view:

* the Ruby VM is sketchy, rather like the Java VM around 1997
* the single-threaded nature of Rails dispatch handling means we may
incur a big memory/hardware hit, for example if pages depend on remote
services with varying response time.

Nonetheless Rails is still attractive because of its elegance and
expressiveness. But keep your eyes open.

What scale of applications run on the Engine Y. site?

* terminology: ‘slice’ is a virtual Xen server – around 30 req/second
* theballot.org – Ran on 2 slices. 1Gb/second of traffic. They ran
on a $20/month host before then.
* kongregate – Flash game distribution site. 3 slices now. Deploying
several times a day. They plan for 67 boxes by next summer.
* They have made scaling easy, to levels equivalent to basecamp.

Scaling solutions of theirs

Start with 2 Load balancers
Slices don't even have disks; they mount root from an external FS via GFS.
Each slice gets 5 mongrel instances. This stuff runs nginx (pronounced
"engine x").
Each ‘slice’ machine stores a DB instance. There is a rails plugin
for managing writes/reads.
Use AoE (ATA over Ethernet) RAID for disk store.
Likely bottleneck is slices, not file system. Single cluster would
be 16-24 machines (which is a big web site)
On a sudden spike when hosted with them, they can add slices within
an hour.
For us, what we build now… we don't need to do anything special to be
hosted by them. It’ll generally migrate easily.
Capistrano is used by them for deployment. It helps a lot.
Number 1 performance issue that they see is the N+1 problem: poorly
structured SQL.
attr_accessible, attr_protected is IMPORTANT
Memory usage is an issue on servers. Each Mongrel process is at least
40Meg. Some extreme cases are above 140Meg. Memory is cheap.
Processor usage has not been a factor. All boxes are dual processor quad
core AMDs and they are sleeping.
Don’t worry about it until it is becoming a problem! Don’t
preoptimize.
pennyarcade is a Rails site and it is huge.
Amount of silicon used for rails is 30% to 5x more than other
machines typically used – but so what?
Statement: In the end DB limits you, not the application.
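
To make the N+1 item above concrete, here is a minimal plain-Ruby sketch. It does not use ActiveRecord; `FakeDB`, its methods, and the sample posts/comments are all hypothetical stand-ins that only count how many "queries" each approach issues. In Rails itself the usual remedy is eager loading (the `:include` option to `find` at the time of this session).

```ruby
# Hypothetical in-memory stand-in for a database; it counts every "query".
class FakeDB
  attr_reader :query_count

  def initialize(posts, comments)
    @posts = posts          # array of { id:, title: }
    @comments = comments    # array of { post_id:, body: }
    @query_count = 0
  end

  # One query returning every post.
  def all_posts
    @query_count += 1
    @posts
  end

  # One query per call, returning the comments of a single post.
  def comments_for(post_id)
    @query_count += 1
    @comments.select { |c| c[:post_id] == post_id }
  end

  # One query returning the comments of many posts, grouped by post id.
  def comments_for_all(post_ids)
    @query_count += 1
    @comments.select { |c| post_ids.include?(c[:post_id]) }
             .group_by { |c| c[:post_id] }
  end
end

def build_db
  posts = (1..10).map { |i| { id: i, title: "Post #{i}" } }
  comments = posts.flat_map do |p|
    [{ post_id: p[:id], body: "a" }, { post_id: p[:id], body: "b" }]
  end
  FakeDB.new(posts, comments)
end

# N+1 pattern: one query for the posts, then one more query per post.
def naive_counts(db)
  db.all_posts.map { |p| [p[:id], db.comments_for(p[:id]).size] }.to_h
end

# Batched pattern: one query for the posts, one for all their comments.
def batched_counts(db)
  posts = db.all_posts
  by_post = db.comments_for_all(posts.map { |p| p[:id] })
  posts.map { |p| [p[:id], (by_post[p[:id]] || []).size] }.to_h
end

naive_db = build_db
batched_db = build_db
naive = naive_counts(naive_db)
batched = batched_counts(batched_db)
puts "naive: #{naive_db.query_count} queries, batched: #{batched_db.query_count} queries"
```

With 10 posts the naive version issues 11 queries and the batched one issues 2, and the gap grows linearly with the number of rows.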

Lack of multithreading is raised as a question
Case study:

Java vs Ruby – Say, 1000 simultaneous requests
* Mongrel can multithread but can back up on slow request
dispatch.
* In cases when you have to wait for things to do stuff –
BackgrounDRb is used. This releases the lock on the worker. Also look
at ‘merb’ – mongrel plus erb. First use for this is image upload.
* In a typical rails environment image upload locks the process.
* Worst case – 100Meg mongrel processes, 1000 threads
simultaneously. That’s 100Gig; at 16Gig per machine that makes for 8
machines… Not a big deal.
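
The worst-case arithmetic above can be checked with a few lines of Ruby. This is only a back-of-the-envelope sketch: `machines_needed` is a hypothetical helper, it uses decimal gigabytes as the notes do, and it ignores memory used by the OS and anything else on the box. Strict division gives 7 machines for the worst case; the figure quoted in the session is a little higher, presumably leaving some headroom (that reading is an assumption).

```ruby
# Back-of-the-envelope memory sizing using the figures quoted in the session.
# Ignores OS overhead and any other processes sharing the machine.
def machines_needed(processes:, mb_per_process:, gb_per_machine:)
  total_mb = processes * mb_per_process
  machine_mb = gb_per_machine * 1000   # decimal gigabytes, as in the notes
  (total_mb.to_f / machine_mb).ceil    # round up: a partial machine is a machine
end

# Worst case from the notes: 1000 processes at 100 MB each, 16 GB machines.
worst_case = machines_needed(processes: 1000, mb_per_process: 100, gb_per_machine: 16)

# Same load at the 40 MB minimum process size mentioned earlier in the notes.
typical = machines_needed(processes: 1000, mb_per_process: 40, gb_per_machine: 16)

puts "worst case: #{worst_case} machines, at 40 MB each: #{typical} machines"
```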

Array implementation and rails calls

* Supposedly each rails call creates 60000(!) arrays.
* There is a patch to make Array implementation quicker – but it is
not accepted yet.

Problem: Ruby is some guy’s hobby

* At rubyconf matz’s talk was underwhelming. Development is way slow.
* Rubinius – The interpreter would be written in Ruby and compiled
to C. Apparently good performance gains have been seen.

Corporate support, etc.

* IBM hosting this
* Sun doing jruby
* See recent post on digg – php eats rails for lunch? Presumably
this post: http://ohloh.net/wiki/articles/php_eats_rails

Hiring

* Hiring is about to go dot.com stupid – anybody who breathes is
almost good enough.
* Hard to find good programmers who know rails and ruby
* Good interview question for them: Have you ever implemented a
binary level protocol?

Bill_Lipa · November 20, 2006, 2:46pm

Bill - thank you for those notes; they made very interesting reading.
What Ezra and Tom are doing at EngineYard is truly impressive.

Keep an eye on Charles Nutter’s (JRuby) blog,

JRuby is slower than CRuby at present, but it is catching up, and has
the potential to run Rails without the N times memory overhead for N
Rails processes.

I’ve read that Litespeed tries to achieve something similar, by loading
Ruby and Rails before forking additional server processes.
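
The load-then-fork idea can be sketched in plain Ruby on a Unix-like system. This is not a description of how Litespeed works internally: `FRAMEWORK` is a hypothetical stand-in for the loaded Ruby and Rails code, and `WORKERS` is an arbitrary count. Whether the forked processes actually keep sharing that memory depends on the operating system's copy-on-write behaviour and on how the Ruby garbage collector touches pages, so treat the memory saving as potential rather than guaranteed.

```ruby
# Stand-in for "loading Ruby and Rails": build something large once, in the parent.
FRAMEWORK = (1..100_000).map { |i| "route_#{i}" }.freeze

WORKERS = 3   # hypothetical worker count

# Fork the workers only after the expensive load has happened.
children = WORKERS.times.map do |n|
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    # The child sees FRAMEWORK without loading it again.
    writer.puts "worker #{n} sees #{FRAMEWORK.size} routes"
    writer.close
    exit!(0)   # leave immediately, skipping the parent's at_exit handlers
  end
  writer.close
  [pid, reader]
end

# Collect one report line from each worker and reap the child processes.
reports = children.map do |pid, reader|
  line = reader.gets.to_s.chomp
  reader.close
  Process.wait(pid)
  line
end

puts reports
```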

regards

Justin F.

Bill_Lipa · November 20, 2006, 2:46pm

On Nov 19, 2006, at 5:15 PM, Justin F. wrote:

Bill - thank you for those notes; they made very interesting reading.
What Ezra and Tom are doing at EngineYard is truly impressive.

Thanks!

–
– Tom M., CTO
– Engine Y., Ruby on Rails Hosting
– Reliability, Ease of Use, Scalability
– (866) 518-YARD (9273)