Forum: Rails deployment Highwire, Ruby Load Balancer

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- and Ruby-related community platforms.
Eddy (Guest)
on 2007-05-06 05:05
After stumbling across the post below, I was playing around with it, but
noticed that if there was ever a bug in the app, the whole domain would
freeze up. Is there anyone else using this out there? I just tried
emailing him... we'll see if support is still around.

Paul B. paul at paulbutcher.com
Wed Sep 20 06:18:53 EDT 2006


We have been searching for a Rails deployment architecture which works
for us for some time. We've recently moved from Apache 1.3 + FastCGI to
Apache 2.2 + mod_proxy_balancer + mongrel_cluster, and it's a
significant improvement. But it still exhibits serious performance
problems.

We have the beginnings of a fix that we would like to share.

To illustrate the problem, imagine a two-element mongrel cluster running
a Rails app containing the following simple controller:

  class HomeController < ApplicationController
    def fast
      sleep 1
      render :text => "I'm fast"
    end

    def slow
      sleep 10
      render :text => "I'm slow"
    end
  end

and the following test app:

  #!/usr/bin/env ruby
  require 'net/http'   # needed for Net::HTTP.get below
  require File.dirname(__FILE__) + '/config/boot'
  require File.dirname(__FILE__) + '/config/environment'

  end_time = 1.minute.from_now

  fast_count = 0
  slow_count = 0

  fastthread = Thread.start do
    while Time.now < end_time do
      Net::HTTP.get 'localhost', '/home/fast'
      fast_count += 1
    end
  end

  slowthread = Thread.start do
    while Time.now < end_time do
      Net::HTTP.get 'localhost', '/home/slow'
      slow_count += 1
    end
  end

  fastthread.join
  slowthread.join

  puts "Fast: #{fast_count}"
  puts "Slow: #{slow_count}"

In this scenario, there will be two requests outstanding at any time,
one "fast" and one "slow". Since the fast action takes about 1 second
and the slow one about 10, you would expect approximately 60 fast and 6
slow GETs to complete over the course of a minute. This is not what
happens; approximately 12 fast and 6 slow GETs complete per minute.

The reason is that mod_proxy_balancer assumes that it can send multiple
requests to each mongrel, so fast requests end up waiting behind slow
requests even if there is an idle mongrel server available.
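
The queueing effect can be seen with a toy Ruby model (illustration
only; this is not mod_proxy_balancer's actual algorithm, just the
behaviour it produces):

```ruby
# Toy model of naive round-robin dispatch: requests are assigned to a
# backend's queue up front, ignoring how busy that backend already is.
backends = [[], []]          # each backend is a queue of request durations
durations = [10, 1, 1]       # one slow request, then two fast ones

durations.each_with_index { |d, i| backends[i % 2] << d }

backends.each_with_index do |queue, i|
  t = 0
  queue.each do |d|
    t += d
    puts "backend #{i}: #{d}s request completes at t=#{t}s"
  end
end
# The second fast request lands behind the slow one on backend 0 and
# completes at t=11s, even though backend 1 has been idle since t=1s.
```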

We've experimented with various different configurations for
mod_proxy_balancer without successfully solving this issue. As far as
we can tell, all other popular load balancers (Pound, Pen, balance)
behave in roughly the same way.

This is causing us real problems. Our user interface is very
time-sensitive: for common user actions, a page refresh delay of more
than a couple of seconds is unacceptable. What we're finding is that if
we have (say) a reporting page which takes 10 seconds to display (an
entirely acceptable delay for a rarely-used report), then our users are
seeing similar delays on pages which should be virtually instantaneous
(and would be, if their requests were directed to idle servers). Worse,
we're occasionally seeing unnecessary timeouts because requests are
queuing up on one server.

The real solution to the problem would be to remove Rails' inability to
handle more than one thread. In the absence of that solution, however,
we've implemented (in Ruby) what might be the world's smallest load
balancer. It only ever sends a single request to each member of the
cluster at a time. It's called HighWire and is available on RubyForge
(no gem yet; it's on the list of things to do!):

  svn checkout svn://rubyforge.org/var/svn/highwire
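
The core idea, never having more than one outstanding request per
backend, can be sketched roughly like this (a simplified illustration,
not HighWire's actual code; the class name is made up):

```ruby
require 'thread'

# Sketch of a "one request per backend at a time" dispatcher.
class SingleRequestBalancer
  def initialize(backends)
    @idle = Queue.new                 # thread-safe pool of idle backends
    backends.each { |b| @idle << b }
  end

  # Block until some backend is free, run the request on it, and
  # return the backend to the idle pool afterwards.
  def dispatch(request)
    backend = @idle.pop               # blocks while every backend is busy
    begin
      backend.call(request)
    ensure
      @idle << backend
    end
  end
end

# Usage: lambdas stand in for the mongrel processes.
balancer = SingleRequestBalancer.new([
  lambda { |req| "served #{req} by A" },
  lambda { |req| "served #{req} by B" }
])
puts balancer.dispatch("/home/fast")
```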

Using this instead of mod_proxy_balancer, and running the same test
script above, we see approximately 54 fast and 6 slow requests per
minute.

HighWire is very young and has a way to go. It's not had any serious
optimization or testing, and there are a bunch of things that need
doing before it can really be considered production ready. But it does
work for us, and does produce a significant performance improvement.

Please check it out and let us know what you think.
Luis L. (Guest)
on 2007-05-06 08:24
(Received via mailing list)
On 5/5/07, Eddy <removed_email_address@domain.invalid> wrote:
[...]

>       sleep 10
>   end_time = 1.minute.from_now
>
>   puts "Fast: #{fast_count}"
>   puts "Slow: #{slow_count}"
>

OK, first things first:

sleep is not good in "threaded" Ruby applications. Long sleeps can
freeze the whole VM, not just the thread involved.

Also, a Rails app is locked inside a big mutex to work around the
thread-safety (better to call it thread-unsafety) issues of Rails. So
any incoming connection that needs to be served by the Rails dispatcher
gets put into the queue.
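
In other words, each mongrel process behaves roughly like this sketch
(hypothetical code, not the actual Mongrel/Rails source):

```ruby
require 'thread'

# Sketch of the "big mutex" around the Rails dispatcher. While one
# request holds the lock, every other request on this process waits,
# no matter how many threads the server accepts connections on.
DISPATCH_LOCK = Mutex.new

def handle_request(path)
  DISPATCH_LOCK.synchronize do
    # Dispatcher.dispatch would run here; nothing else dispatches
    # until it returns.
    "response for #{path}"
  end
end

threads = ["/home/fast", "/home/slow"].map do |path|
  Thread.new { handle_request(path) }
end
threads.each { |t| puts t.value }
```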

Most of the load balancers named above behave like that: round-robin
balancing. Even if you can weight them, they are strict about it and do
not adapt well over time.

> We've experimented with various different configurations for
> mod_proxy_balancer without successfully solving this issue. As far as we
> can
> tell, all other popular load balancers (Pound, Pen, balance) behave in
> roughly the same way.

From my point of view, they should learn the timings from each member
of the cluster and recalculate the weight each one can handle.
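
One way to sketch that idea (hypothetical code; none of the balancers
mentioned actually works this way): keep a moving average of each
member's response times and prefer the currently fastest one.

```ruby
# Hypothetical adaptive balancer: track an exponential moving average
# of response times per backend and pick the one that looks fastest.
class AdaptiveBalancer
  def initialize(backends)
    @backends = backends
    @avg = Hash.new(0.0)    # seconds; optimistic default for new members
  end

  # Record an observed response time for a backend (alpha = 0.3).
  def record(backend, seconds)
    @avg[backend] = 0.7 * @avg[backend] + 0.3 * seconds
  end

  # Choose the backend with the lowest average response time.
  def pick
    @backends.min { |a, b| @avg[a] <=> @avg[b] }
  end
end

b = AdaptiveBalancer.new(["mongrel_8000", "mongrel_8001"])
b.record("mongrel_8000", 10.0)   # saw a slow response
b.record("mongrel_8001", 1.0)    # saw a fast response
puts b.pick                      # prefers mongrel_8001
```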

> unnecessary timeouts because requests are queuing up on one server.
>

Maybe you could switch to a lightweight solution that partially covers
these problems (Mongrel + ERB; do a Google search for it) ;-)

> The real solution to the problem would be to remove Rails' inability to
> handle more than one thread.

That is a real problem: a lot of parts of Rails aren't thread-safe.
Adapting them will require a huge amount of work, but I agree it is
worth it.

>   svn checkout svn://rubyforge.org/var/svn/highwire
>

I haven't checked the code (yet), but it sounds interesting. It would
also be nice to have a configurable load-distribution strategy (maybe
via callbacks or something) that lets you change how the balancing
works.

> Please check it out and let us know what you think.
>

Excellent news, thanks for sharing it with us.

--
Luis L.
Multimedia systems
-
Leaders are made, they are not born. They are made by hard effort,
which is the price which all of us must pay to achieve any goal that
is worthwhile.
Vince Lombardi
Eddy (Guest)
on 2007-05-06 22:13
> btw, I'm not surprised highwire isn't talked about more.

A response from the Highwire guy:

I'm afraid that we haven't been using Highwire for a while now. That
doesn't mean that the problem Highwire was designed to address
doesn't still exist (it very definitely does), but we have decided to
follow a different solution. We have divided our mongrel cluster into
two halves - a "normal" cluster on which the common "fast" operations
take place and an "admin" cluster on which occasional "slow" things
take place. Very soon, we plan to take this further and move the
"slow" cluster onto an entirely separate server.

That means, I'm afraid, that we haven't got a patch for the problem
you mention because we've not developed it any further. Having said
that, Highwire really couldn't be simpler, so if you want to take on
creating a patch, or even ownership of the Highwire project, please
be my guest! Let me know if you're interested.

BTW - you might be interested in a couple of blog articles we've
recently written on Ruby on Rails:

http://about.82ask.com/news/wizardry/

------------------------------------------------
Paul B.
CTO
82ASK
Mobile: +44 (0) 7740 857648
Main: +44 (0) 1223 309080
Fax: +44(0) 1223 309082
Email: removed_email_address@domain.invalid
MSN: removed_email_address@domain.invalid
AIM: paulrabutcher
Skype: paulrabutcher
LinkedIn: http://www.linkedin.com/in/paulbutcher
------------------------------------------------

Michael W. (Guest)
on 2007-06-26 23:11
Just a note for anyone else who, like me, kept hitting this forum entry
while googling around. We replaced HighWire successfully with balance
(http://www.inlab.de/balanceng/).

We had to change one of the default build constants (MAXCHANNELS) from
16 to 32, but after that it's working great against our pool of 10
mongrels.


Running it as follows configures it to feed one request at a time,
round-robin, to each free mongrel instance; if all 10 are concurrently
busy, it switches to sending requests round-robin to all of them, a la
mod_proxy_balancer.

 /usr/local/bin/balance -M -b 127.0.0.1 7050 127.0.0.1:8000:1
127.0.0.1:8001:1 127.0.0.1:8002:1 127.0.0.1:8003:1 127.0.0.1:8004:1
127.0.0.1:8005:1 127.0.0.1:8006:1 127.0.0.1:8007:1 127.0.0.1:8008:1
127.0.0.1:8009:1 ! 127.0.0.1:8000 127.0.0.1:8001 127.0.0.1:8002
127.0.0.1:8003 127.0.0.1:8004 127.0.0.1:8005 127.0.0.1:8006
127.0.0.1:8007 127.0.0.1:8008 127.0.0.1:8009
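
The pattern in that command line: group 0 lists every mongrel with a
":1" suffix (at most one connection each), then "!" starts failover
group 1, which lists the same mongrels with no limit. A Ruby sketch of
generating the same arguments for the 10 backends above:

```ruby
# Build the balance argument list for 10 mongrels on ports 8000-8009.
# Group 0: each backend capped at 1 connection; "!" separates the
# failover group, which repeats the backends uncapped.
hosts = (8000...8010).map { |port| "127.0.0.1:#{port}" }
argv  = ["/usr/local/bin/balance", "-M", "-b", "127.0.0.1", "7050"] +
        hosts.map { |h| "#{h}:1" } + ["!"] + hosts
puts argv.join(" ")
```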

So far it's been working very well. Status from the first afternoon of
use is as follows:
# /usr/local/etc/rc.d/balance status
balance at 127.0.0.1:7050
GRP Type  #   S   ip-address  port   c  totalc  maxc      sent      rcvd
  0   RR  0 ENA   127.0.0.1   8000   0    1972     1   1534959  17503343
  0   RR  1 ENA   127.0.0.1   8001   0    2122     1   1604954  38727099
  0   RR  2 ENA   127.0.0.1   8002   0    2220     1   6587106  21329912
  0   RR  3 ENA   127.0.0.1   8003   1    1412     1   6868618  16225686
  0   RR  4 ENA   127.0.0.1   8004   0    1952     1   2376831  21449204
  0   RR  5 ENA   127.0.0.1   8005   1    1564     1  12380050  18871952
  0   RR  6 ENA   127.0.0.1   8006   0    1894     1   5663904  21877505
  0   RR  7 ENA   127.0.0.1   8007   0    2025     1  22239136  20035666
  0   RR  8 ENA   127.0.0.1   8008   1    1787     1   8410442  21476800
  0   RR  9 ENA   127.0.0.1   8009   0    1914     1  11913518  18254808
  1   RR  0 ENA   127.0.0.1   8000   0     231     0    116116   1439980
  1   RR  1 ENA   127.0.0.1   8001   0     232     0    150685   2026796
  1   RR  2 ENA   127.0.0.1   8002   0     232     0    115881    638747
  1   RR  3 ENA   127.0.0.1   8003   0     231     0    117039   1072487
  1   RR  4 ENA   127.0.0.1   8004   0     233     0    121611   1491177
  1   RR  5 ENA   127.0.0.1   8005   0     233     0    120231   1162390
  1   RR  6 ENA   127.0.0.1   8006   0     231     0   4665602   1843309
  1   RR  7 ENA   127.0.0.1   8007   0     229     0    118730   1373138
  1   RR  8 ENA   127.0.0.1   8008   0     229     0    114949   1497304
  1   RR  9 ENA   127.0.0.1   8009   0     230     0    119177   1224900




This topic is locked and can not be replied to.