Recent Criticism of Ruby (Scalability, etc.)

On Tue, Sep 25, 2007 at 11:01:43PM +0900, Glenn G. wrote:

So yeah, get out your checkbooks and write more checks for more
servers and sure Ruby scales just fine !

Ruby Rocks !

2 years x 1 developer @ $70k = 58x Dell PowerEdge 860 Quad Core Xeon X3210s

Job Security Rocks!

Oops. I should have read the responses before I posted my own cost
comparison.

Ruby M. wrote:

that hardware ! (*Just forget the fact that another language might
have saved some money in compute gear. I mean, don’t even think about
this at all. Why are you still reading this sentence? I told you to
forget about what those 58 servers cost… Money is irrelevant to those
who love Ruby! *)

Ruby Rocks !

The 58 servers next year will cost half as much. The developer next year
will cost the same or more.

I work in embedded systems. The hardware/software cost analysis turns
out differently here. An embedded system will typically be deployed on
thousands and sometimes millions of devices. Here it is worth saving
pennies on compute hardware by spending big $$$ on software development
and trying to keep the code as small and optimized as possible. You will
still find many cheap 8-bit and 16-bit computers running your cars, phones
and refrigerators. Finding developers who can deal properly with those
sorts of devices is not cheap.

However, for a web server with a hardware deployment footprint orders of
magnitude smaller than an embedded systems deployment, it doesn’t make
sense to try to save money on hardware. Where you will blow your budget
is on the software developers.

With regard to the Python/Ruby speed issues: I am currently watching a
complex scheduling algorithm tick by very slowly in a shell while I
write this post. The algorithm should have been written in C/C++
and not in Python/Psyco or Ruby, and I am going nuts waiting for it to
complete.

( BTW I didn’t write it )

B

On Sep 27, 2007, at 12:43 AM, Chad P. wrote:

I’m confused. If it worked . . . why did he throw it away and redo it in
PHP?

Because he was able to do it himself, and then both read the code
and rewrite it.
Please read http://www.zedshaw.com/essays/c2i2_hypothesis.html,
the Gadfly Festival section, and everything on the Big Rewrite by Chad F…
In this case PHP was (to put it roughly, and not at all harshly) an
alternative to an Excel spreadsheet with VBA macros - an issue of
ownership, I guess. Nothing critical.

On Thu, Sep 27, 2007 at 02:34:34PM +0900, julik wrote:

the Gadfly Festival section, and everything on the Big Rewrite by Chad F…
In this case PHP was (to put it roughly, and not at all harshly) an
alternative to an Excel spreadsheet with VBA macros - an issue of
ownership, I guess. Nothing critical.

How do you know that was the reason? The impression I got from the
lengthy, slashdotted explanation was that the project was unfinished
after two years, so he decided to junk it and start over in PHP.

Brad P. wrote:

With regard to the Python/Ruby speed issues: I am currently watching a
complex scheduling algorithm tick by very slowly in a shell while I
write this post. The algorithm should have been written in C/C++
and not in Python/Psyco or Ruby, and I am going nuts waiting for it to
complete.

( BTW I didn’t write it )

“Complex scheduling algorithm” means different things to different
people. Is it slow because the algorithm sucks or slow because it’s not
written in C/C++? What kind of scheduling is it – combinatorial?

On Sep 27, 2007, at 8:25 AM, Chad P. wrote:

How do you know that was the reason? The impression I got from the
lengthy, slashdotted explanation was that the project was unfinished
after two years, so he decided to junk it and start over in PHP.

He decided to write it himself. That’s the main piece.

Chad P. wrote:

Yes, he decided to write it himself – after giving up on Rails, for
reasons that, as far as I’m aware, relate to the fact that it wasn’t
done in Rails after two years.

Actually, if I can be allowed to read between the lines, he went back
when his Ruby mentor left and he realized that he was not able to do it
in Ruby. He went back to what he knew when he was left on his own. He
sort of says that in the post.

On Fri, Sep 28, 2007 at 01:43:03AM +0900, Lloyd L. wrote:

Chad P. wrote:

Yes, he decided to write it himself – after giving up on Rails, for
reasons that, as far as I’m aware, relate to the fact that it wasn’t
done in Rails after two years.

Actually, if I can be allowed to read between the lines, he went back
when his Ruby mentor left and he realized that he was not able to do it
in Ruby. He went back to what he knew when he was left on his own. He
sort of says that in the post.

Well . . . yes, but judging by the phrasing he considers the main reason
for switching back to PHP to be that Rails didn’t get the job done in two
years of development. He may well have stuck with Rails longer if the
“Ruby mentor” hadn’t departed for greener pastures, but a lot of effort
seems to be spent in that piece on pointing out that he got done in a
very short time what Rails hadn’t allowed in two years.

Note that I’m speaking of what the “analysis” seems to be saying, and not
my own opinions of what Rails can or cannot do. It seems ridiculous to
me that someone couldn’t build a working web app in two years, regardless
of the tool used in the development effort (as I’ve already said).

On Thu, Sep 27, 2007 at 11:10:17PM +0900, julik wrote:

On Sep 27, 2007, at 8:25 AM, Chad P. wrote:

How do you know that was the reason? The impression I got from the
lengthy, slashdotted explanation was that the project was unfinished
after two years, so he decided to junk it and start over in PHP.

He decided to write it himself. That’s the main piece.

Yes, he decided to write it himself – after giving up on Rails, for
reasons that, as far as I’m aware, relate to the fact that it wasn’t done
in Rails after two years.

On 23.09.2007 21:08, Phlip wrote:

forrie wrote:

I presume most people here read today’s article on Slashdot which had
some critique about Ruby and scaling to a large architecture.

Nope.

Same here.

I go with Dave T.'s verbiage “Ruby stays out of your way”. That says
it all - dynamic typing, clear simple statements, endless extensibility,
and realistic scaling, all in a nutshell.

I remember a previous gig where we used Java heavily and the scaling
was pretty linear. Need more space? Add another blade and so on…

That’s not scaling! (Okaaay, that’s only one aspect of scaling!)

It definitely is. One aspect of Ruby that hinders scaling is the
absence of native threads, IMHO. On the other hand, mechanisms are
provided for IPC (DRb, for example) which are easy to use and thus may be
counted as compensating, at least partially, for the lack of native
threading.
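
As a minimal sketch of the DRb approach (the URI, port, and Scheduler class below are invented for the example):

```ruby
require 'drb/drb'

# Server process: expose a plain Ruby object over DRb so other processes
# can call it, spreading work across processes instead of native threads.
class Scheduler
  def schedule(job)
    "scheduled #{job}"
  end
end

DRb.start_service('druby://localhost:8787', Scheduler.new)
DRb.thread.join

# Client process (run separately):
#   require 'drb/drb'
#   DRb.start_service
#   scheduler = DRbObject.new_with_uri('druby://localhost:8787')
#   puts scheduler.schedule('job-42')
```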

How did your Java design itself scale? The rate you add new features -
did it go up or down over time? That’s scaling. If the rate doesn’t
slow down, you have time to tune your code to speed it up and handle
more users…

IMHO this is not scaling (well, at least not if you follow common usage)
but extensibility or flexibility of the design, which translates into
developer efficiency. Which does not say this is superfluous or the
wrong measure, not at all. I just don’t think “scalability” is the
right term here.

Kind regards

robert

On 26.09.2007 05:36, John J. wrote:

2 years to rebuild in Rails?! How?!
Simple. You can’t force an existing database structure onto a framework
that has an ORM. Doesn’t work well if at all.

I think this statement is wrong at this level of generality. How well it
works depends on how the schema fits a particular ORM tool.

You can migrate the data. Easy.

That also depends on the schemas involved. The complexity of
translating a data or object model into another is mainly governed by
the similarity of the schemas.

Kind regards

robert

M. Edward (Ed) Borasky wrote:

people. Is it slow because the algorithm sucks or slow because it’s not
written in C/C++? What kind of scheduling is it – combinatorial?

I have never looked at the algorithm myself. It is a TDMA message
scheduling algorithm: given a set of messages on a time-partitioned bus,
with multiple transmitters and receivers, find the optimal message
schedule. It is not too different from the classic M$ Project scheduling
really, except that you have thousands of messages to schedule and many
constraints. Still, I am not sure that waiting two minutes to several hours
for a schedule to complete would happen if exactly the same algorithm were
written in C rather than Python. This is pure speculation, however, as
I’ve never had time to look over it myself and make a proper judgment.
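
Just to give a flavour of that kind of packing problem (a toy sketch only; the messages, lengths, and slot size below are invented, and this has nothing to do with the actual tool’s algorithm):

```ruby
# Toy first-fit-decreasing packing of messages into fixed-length TDMA slots.
SLOT_TICKS = 100

messages = [
  { id: :nav,    ticks: 40 },
  { id: :engine, ticks: 70 },
  { id: :status, ticks: 30 },
  { id: :debug,  ticks: 55 },
]

slots = [] # each slot tracks remaining capacity and the messages assigned to it

messages.sort_by { |m| -m[:ticks] }.each do |msg|
  slot = slots.find { |s| s[:free] >= msg[:ticks] }
  slot ||= { free: SLOT_TICKS, ids: [] }.tap { |s| slots << s }
  slot[:free] -= msg[:ticks]
  slot[:ids]  << msg[:id]
end

slots.each_with_index do |slot, i|
  used = SLOT_TICKS - slot[:free]
  puts "slot #{i}: #{slot[:ids].join(', ')} (#{used}/#{SLOT_TICKS} ticks)"
end
```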

However, the argument from the developers is that real customers of the
tool rarely run a schedule. Once the schedule has been fixed it is
rarely changed, so the pain of waiting five minutes, or even a few hours
for a large schedule, is minor in the overall scheme of a project that
may last many years.

The only reason it bugs me is that I am constantly running schedules to
generate test cases for some other downstream code I am running. It
therefore causes me pain in ways it would not necessarily do for a real
customer. This brings us round to the original argument I guess. Perhaps
in this case Python is suitable. It is easy to code, and easy to
analyze, debug and maintain. The slowness is not a huge factor because
the customer just doesn’t require it to be “fast”.

B

Chad P. wrote:

Assuming roughly an $80k salary and a $2,000 server, a server is worth
about 50 hours of programmer time.

I just figured I’d provide a simple starting place for comparing the cost
of software development with that of hardware upgrades.
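
Spelling that arithmetic out (a back-of-the-envelope sketch only, assuming roughly 2,000 working hours a year; the salary and server price are the round figures from the post above):

```ruby
# Back-of-the-envelope only: how many programmer hours one server costs.
salary         = 80_000.0  # assumed annual salary, from the post above
hours_per_year = 2_000.0   # assumed working hours per year
server_price   = 2_000.0   # assumed server cost, from the post above

hourly_rate = salary / hours_per_year             # ~$40/hour
puts "one server ~= #{(server_price / hourly_rate).round} programmer hours"
# => one server ~= 50 programmer hours
```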

I find this perspective puzzling. In most large datacenters, the big
cost of operation is neither the cost of the servers nor the cost of the
development time to put code on them; it’s the peripheral electricity,
administration, and cooling costs once the written application must be
deployed to thousands of users.

An application that scales poorly will require more hardware. Hardware
is cheap, but power and administrative resources are not. If you need 10
servers to run a poorly-scaling language/platform versus some smaller
number of servers to run other “faster/more scalable”
languages/platforms, you’re paying a continuously higher cost to keep
those servers running. Better scaling means fewer servers and lower
continuous costs.

Even the most inexpensive and quickly-developed application’s savings
will be completely overshadowed if deployment to a large datacenter
results in unreasonably high month-on-month expenses.

  • Charlie

2 years to rebuild in Rails?! How?!

Big companies get into this trouble when they practice “Big Requirements
Up Front”. If you schedule a huge number of requirements, then try to do
them all at the same time, you make development extremely hard. This is
how the huge multi-billion-dollar software project failures happen in the
news. Rails is not immune.

The correct way to replace an old project is a process pattern called
“Strangler Fig”. You ask the client what’s one tiny feature to add to the
old system - what are they waiting for - and you implement it using the
new technology. You link the old system to it, and you put it online as
soon as possible.

A project that succeeds early cannot fail in 2 years. (It can be
cancelled at any time, of course, but with no gap between the cost and
the amount of features deployed.)

Then you ask the client what’s the next feature, and you implement this,
and use it as an excuse to convert a little more of the old system into
the new system. And if there’s no reason to completely retire the old
system, you don’t. You could save a year like that!

Someone got ambitious, and actually believed Rails’s hype, and was
over-confident.

Thank you!! It’s about time somebody put a dollar figure on the cost of
poor scalability and highlighted the nonsense of “adding servers is
cheaper than hiring programmers.” They are two entirely different
economic propositions.

That’s true. However, very roughly, compute resource can scale about
linearly with compute requirement.

Alternatively, you can reduce the compute requirement by having a more
complex software system. However, the number of programmers needed to
build and maintain a given complexity of software certainly doesn’t
scale linearly with the system complexity (see The Mythical Man-Month).

I’m also not a fan of throwing hardware at a problem as a cure-all, and
I don’t like dumb “enterprise” grade solutions when there are much more
powerful alternatives. But I’m also not a fan of automatically assuming
that you need to build lots of clever and complex things, and that
existing components just won’t do the job. Pragmatism seems like the best
approach :)

On Thu, 4 Oct 2007 00:19:40 +0900, [email protected] wrote:

Thank you!! It’s about time somebody put a dollar figure on the cost of
poor scalability and highlighted the nonsense of “adding servers is
cheaper than hiring programmers.” They are two entirely different
economic propositions.

That’s true. However, very roughly, compute resource can scale about
linearly with compute requirement.

What about Amdahl’s law?

Alternatively, you can reduce the compute requirement by having a more
complex software system.

While it’s true that very simple systems can perform badly because
they use poor algorithms and/or do not make dynamic optimizations,
more complex software generally means increased computational
requirements.

-mental

On Thu, Oct 04, 2007 at 12:38:25AM +0900, MenTaLguY wrote:

On Thu, 4 Oct 2007 00:19:40 +0900, [email protected] wrote:

Thank you!! It’s about time somebody put a dollar figure on the cost of
poor scalability and highlighted the nonsense of “adding servers is
cheaper than hiring programmers.” They are two entirely different
economic propositions.

That’s true. However, very roughly, compute resource can scale about
linearly with compute requirement.

What about Amdahl’s law?

What about it? Unless you’re writing software that doesn’t scale with
the hardware, more hardware means linear scaling, assuming bandwidth
upgrades. If bandwidth upgrades top out, you’ve got a bottleneck no
amount of hardware purchasing or programmer time will ever solve.
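
For reference, a quick sketch of what Amdahl’s law says about adding machines (the 95% parallel fraction is an arbitrary example):

```ruby
# Amdahl's law: if a fraction p of the work can be parallelized across
# n machines, overall speedup = 1 / ((1 - p) + p / n), so the serial
# fraction sets a hard ceiling no matter how much hardware you add.
def amdahl_speedup(parallel_fraction, machines)
  1.0 / ((1.0 - parallel_fraction) + parallel_fraction / machines)
end

parallel_fraction = 0.95 # assume 95% of the work parallelizes
[1, 2, 10, 100, 1_000].each do |n|
  printf("n = %4d  speedup = %.2fx\n", n, amdahl_speedup(parallel_fraction, n))
end
# As n grows, the speedup approaches 1 / (1 - 0.95) = 20x and no more.
```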

Alternatively, you can reduce the compute requirement by having a more
complex software system.

While it’s true that very simple systems can perform badly because
they use poor algorithms and/or do not make dynamic optimizations,
more complex software generally means increased computational
requirements.

I thought “complex” was a poor choice of term here, for the most part.
It was probably meant as a stand-in for “more work at streamlining the
design, combined with the greater code cleverness needed to scale without
throwing hardware at the problem.”

On Oct 3, 9:19 am, Charles Oliver N. [email protected]
wrote:

I find this perspective puzzling. In most large datacenters, the big
cost of operation is neither the cost of the servers nor the cost of the
development time to put code on them; it’s the peripheral electricity,
administration, and cooling costs once the written application must be
deployed to thousands of users.

“Most”? If you define “large data center” as the very top echelon,
then maybe, but even then I’d like to see some data. I expect the vast
majority (all?) of readers of this ng will be involved in scenarios in
which the cost of development time far exceeds electricity or server
costs for their deployed applications.

Part of what kept me from getting involved in Ruby sooner than I did
was my erroneous view that I wanted to be using technology that would
be sufficient to run Amazon, Ebay, etc. Little did it matter that I
wasn’t pursuing that type of project - analogous to the fact that
most, if not all, Hummer drivers will never encounter serious off-road
or combat situations :)

I’m all for increasing the performance and scalability of Ruby, but I
think the productivity gains still outweigh the extra runtime costs
for most projects.

On Wed, Oct 03, 2007 at 10:19:57PM +0900, Charles Oliver N. wrote:

servers and sure Ruby scales just fine !
administration, and cooling costs once the written application must be
deployed to thousands of users.

For very small operations, this is true.

An application that scales poorly will require more hardware. Hardware
is cheap, but power and administrative resources are not. If you need 10
servers to run a poorly-scaling language/platform versus some smaller
number of servers to run other “faster/more scalable”
languages/platforms, you’re paying a continuously higher cost to keep
those servers running. Better scaling means fewer servers and lower
continuous costs.

Actually, when people talk about something scaling well or poorly,
they’re usually talking about whether it scales linearly or requires
ever-increasing amounts of some resource. Something that scales very
well requires the addition of one more unit of a given resource to
achieve an increase in capability that matches up pretty much exactly
with the capability per unit of resource already employed. This is
usually counted starting after an initial base resource cost. For
instance, if you have minimal electricity needs for lighting, air
conditioning, and a security system, plus your network infrastructure,
and none of that will need to be upgraded within the foreseeable future,
you start counting your electricity resource usage when you start
throwing webservers into the mix (for a somewhat simplified example). If
you simply add one more webserver to increase load handling by a static
quantity of concurrent connections, you have linear (good) scaling.
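
A toy model of that linear case (all figures below are invented for illustration):

```ruby
# Illustrative only: a fixed base draw (not counted against scaling) plus a
# constant per-server cost, and a static connection capacity per server.
BASE_KW          = 5.0   # assumed fixed overhead (lighting, AC, network)
KW_PER_SERVER    = 0.4   # assumed draw per webserver
CONNS_PER_SERVER = 500   # assumed static capacity per webserver

def capacity(servers)
  servers * CONNS_PER_SERVER
end

def power_kw(servers)
  BASE_KW + servers * KW_PER_SERVER
end

(1..4).each do |n|
  puts "#{n} servers: #{capacity(n)} connections, #{power_kw(n)} kW"
end
# Each added server buys the same extra capacity at the same extra cost:
# that is the linear scaling described above.
```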

On the other hand, if you have a system plagued by interdependencies and
other issues that make your scaling needs non-linear, that kind of
resource cost can get very expensive. Obviously, some software design
needs are part of determining the linearity of your scaling
capabilities, but such needs often involve factors like choosing a
language that makes development easier, a framework that is already
well designed for scaling, and so on. A language that compiles to
relatively high-performance binaries, or one that is compiled to bytecode
and executed by an optimizing VM, can help – but that doesn’t magically
make your software scale linearly. That’s dependent upon how the
software was designed in the first place.

Throwing more programmers at the problem certainly won’t result in a
system that scales linearly either. What a larger number of programmers
on a single project often does, in fact, is ensure that scaling
characteristics across the project are less consistent. You may end up
with one particular part of the overall software serving as a scaling
bottleneck because its design characteristics are sufficiently different
from the rest that it requires either a refactor or ever-increasing
resources as scaling needs get more extreme. Oh, and there’s one more
thing . . .

Even the most inexpensive and quickly-developed application’s savings
will be completely overshadowed if deployment to a large datacenter
results in unreasonably high month-on-month expenses.

Even with the cheapest hardware and energy requirements, costs quickly
become astronomical if you have to throw more programmers at the system.
The more difficult a system is to maintain, the faster the needed
programmer resources grow. That’s the key: programming resources don’t
tend to scale linearly. Hardware resources, except in very poor examples
of software design, usually do.