Ruby on Solaris 10 performance problems


#1

We just installed Ruby on a
Sun T1000: 6-core UltraSPARC T1 CPU, 4G memory, Solaris 10.

Right now Windows is outperforming an UltraSPARC by 25 seconds! Does
anyone know why this would be the case? I have downloaded Ruby and
compiled it on the hardware, I have tried the binaries from sunfreeware,
and the results are the same. Actually, the results from my compile were
a second or two worse than the binaries from Sun!

This is the program I ran for the benchmark. I called it t.rb:

require 'benchmark'

array = (1..1000000).map { rand }

Benchmark.bmbm do |x|
  x.report("sort!") { array.dup.sort! } # copies the array, then sorts the copy in place
  x.report("sort")  { array.dup.sort  } # copies the array; sort returns a new sorted array
end

###########################################

Solaris v10

ruby t.rb

Rehearsal -----------------------------------------
sort! 29.450000 0.020000 29.470000 ( 29.458899)
sort 29.760000 0.010000 29.770000 ( 29.772163)
------------------------------- total: 59.240000 sec

        user     system      total        real

sort! 29.060000 0.010000 29.070000 ( 29.064410)
sort 29.070000 0.000000 29.070000 ( 29.076217)

Windows

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\Documents and Settings\cmackenzie>ruby t.rb
Rehearsal -----------------------------------------
sort! 4.735000 0.000000 4.735000 ( 4.813000)
sort 4.359000 0.000000 4.359000 ( 4.390000)
-------------------------------- total: 9.094000 sec
user system total real
sort! 4.313000 0.000000 4.313000 ( 4.375000)
sort 4.297000 0.015000 4.312000 ( 4.422000)

C:\Documents and Settings\cmackenzie>

Ubuntu Linux

colmac@sideshowbob: ruby t.rb
Rehearsal -----------------------------------------
sort! 3.310000 0.010000 3.320000 ( 3.317860)
sort 3.250000 0.020000 3.270000 ( 3.270037)
-------------------------------- total: 6.590000 sec

        user     system      total        real

sort! 3.280000 0.030000 3.310000 ( 3.318575)
sort 3.250000 0.020000 3.270000 ( 3.269284)
colmac@sideshowbob:


#2

On Jan 28, 12:36 pm, Colin M. removed_email_address@domain.invalid wrote:

We just installed ruby on a
Sun T1000, 6 core UltraSPARC T1 cpu, 4G memory , Solaris 10

Right now Windows is out performing an Ultra SPARC by 25 seconds! Does
anyone know why this would be the case.

What version of Ruby?
Which compiler?
What flags did you build with?
Have you seen http://tinyurl.com/dln99p ?

Regards,

Dan


#3

What version of Ruby?
ruby 1.8.7 (2008-08-11 patchlevel 72) [sparc-solaris2.10]

Which compiler?
gcc (GCC) 3.4.6

What flags did you build with?
I ran configure with whatever default flags it used.

Have you seen http://tinyurl.com/dln99p ?
No, I will check it out.

uname -a

SunOS tk3 5.10 Generic_137137-09 sun4v sparc SUNW,Sun-Fire-T1000

BTW, I downloaded binaries from blastwave.org and the results are even
worse…
#./ruby --version
ruby 1.8.7 (2008-06-20 patchlevel 22) [sparc-solaris2.8]

./ruby t.rb

Rehearsal -----------------------------------------
sort! 34.080000 0.020000 34.100000 ( 34.098341)
sort 33.430000 0.010000 33.440000 ( 33.443781)
------------------------------- total: 67.540000 sec


#4

Robert K. wrote:
…snip… IMHO it’s a dead technology.

Thanks, Robert, I had a bad feeling about that.


#5

2009/1/28 Colin M. removed_email_address@domain.invalid:

We just installed ruby on a
Sun T1000, 6 core UltraSPARC T1 cpu, 4G memory , Solaris 10

Right now Windows is out performing an Ultra SPARC by 25 seconds! Does
anyone know why this would be the case.

SPARC processors are slow; raw CPU speed is not among the strengths of
those beasts - especially since Ruby does not use native threads.
You’re putting the load on a single core only (well, your test does
not contain any concurrency anyway :-) ).
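Robert’s point is easy to demonstrate: on MRI, spreading CPU-bound work across threads buys nothing, because the interpreter never runs Ruby code on more than one core at a time (green threads in 1.8, the GVL later). A minimal sketch; the workload and thread count are arbitrary:

```ruby
require 'benchmark'

# A small CPU-bound workload.
work = lambda { (1..300_000).inject(0) { |sum, i| sum + i } }

serial   = Benchmark.realtime { 4.times { work.call } }
threaded = Benchmark.realtime do
  (1..4).map { Thread.new { work.call } }.each { |t| t.join }
end

# On MRI the two times come out roughly equal; only an implementation
# with native threads (e.g. JRuby) would finish the threaded run faster.
puts "serial: %.2fs  threaded: %.2fs" % [serial, threaded]
```
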

It’s pretty easy to see why: RISC has fewer machine instructions, which
take one clock to execute. Nowadays CISC processors come close to
taking one clock as well by using pipelining, branch prediction and
what not. But CISC processor commands are more powerful, so you need
multiple RISC commands to do the same amount of work. And since clock
speeds are slowly reaching physical limits, RISC cannot compensate with
higher clock rates. That’s why RISC is falling behind. IMHO it’s a dead
technology.

Kind regards

robert


#6

On Jan 28, 2:36 pm, Colin M. removed_email_address@domain.invalid wrote:

We just installed ruby on a
Sun T1000, 6 core UltraSPARC T1 cpu, 4G memory , Solaris 10

Right now Windows is out performing an Ultra SPARC by 25 seconds! Does
anyone know why this would be the case. I have downloaded ruby and
compiled it on the hardware, i have tried the binaries from sunfreeware
and the results are the same. Actually, the results from my compile were
a second or two worse than the binaries from SUN!

I wonder what kind of performance you’d get if you used JRuby 1.1.6+


#7

2009/1/29 Colin M. removed_email_address@domain.invalid:

Robert K. wrote:
…snip… IMHO it’s a dead technology.

Thanks, Robert, I had a bad feeling about that.

:-)

I don’t say that Sparc machines or Solaris operating systems do not
have their merits. It’s just that raw CPU speed is not among them.

Cheers

robert


#8

What machine are

Microsoft Windows XP [Version 5.1.2600]

and

Ubuntu Linux

running on?

mfg, simon … l


#9

Simon K. wrote:

What machine are

Microsoft Windows XP [Version 5.1.2600]

and

Ubuntu Linux

running on?

mfg, simon … l

At this point it does not matter; really, the only question was why it
runs so slowly on the Solaris platform. Robert gave me the answer. Thanks
though.
If you are interested:
Ubuntu is on an Intel E4500 @ 2.2GHz
Windows is on a Dell D630 laptop


#10

Mark T. wrote:

On Jan 28, 2:36 pm, Colin M. removed_email_address@domain.invalid wrote:

We just installed ruby on a
Sun T1000, 6 core UltraSPARC T1 cpu, 4G memory, Solaris 10

Right now Windows is out performing an Ultra SPARC by 25 seconds! Does
anyone know why this would be the case. I have downloaded ruby and
compiled it on the hardware, i have tried the binaries from sunfreeware
and the results are the same. Actually, the results from my compile were
a second or two worse than the binaries from SUN!

I wonder what kind of performance you’d get if you used JRuby 1.1.6+

Java runs 7 times slower on this RISC platform; I’m just going to drop it.
Everything runs slowly.


#11

Colin,

You have an answer, but it isn’t the right answer.

An UltraSPARC T1 processor is a little more powerful than either of
the Core 2 Duo CPUs you tested. You can see this from the published
results for the SPECjbb2005 or SPECweb benchmarks.

The confusion arises because, when using your application as a
benchmark, you only make use of about 5% of the CPU resources of the
Sun box. The T1 processor
only runs at a clock speed of 1GHz but, being both multithreaded and
multi-core, it gives you a total of 24 threads to push work through.
If you used a benchmark that ran 24 or more instances of your
application you could expect to see greater throughput from the Sun
host than the Intel hosts.
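Peter’s suggestion can be sketched with processes rather than threads, since MRI 1.8 cannot spread Ruby threads across cores. A hypothetical multi-instance version of the benchmark (the worker count and array size are arbitrary; on a T1 you would scale the workers toward 24):

```ruby
require 'benchmark'

N_WORKERS = 4 # scale toward 24 on a T1 to cover all hardware threads

elapsed = Benchmark.realtime do
  # One OS process per worker: each sorts its own array, so the kernel
  # can schedule them on separate cores/hardware threads.
  pids = (1..N_WORKERS).map do
    fork do
      data = (1..200_000).map { rand }
      data.sort!
    end
  end
  pids.each { |pid| Process.wait(pid) }
end

puts "#{N_WORKERS} parallel sorts: %.2fs" % elapsed
```

(`fork` is unavailable on Windows, which is fine here since the point is throughput on the Sun box.)
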

This is a great example of the “End of Moore’s Law” issue. We are at
the cusp of a change in hardware technology that could force all
developers to learn about concurrency and parallelizing workloads. The
difference between these two architectures is that Sun embraced the
issue a little earlier than Intel.

I agree with Robert that RISC has had its day in the sun (pun not
intended), but I disagree with the suggestion that this is due to
technical inferiority. The reality is that Sun sat pretty earning
great margins for their hardware for more than a decade. No longer
commercially relevant, it’s ironic that they are now, perhaps for the
first time, competitive on a performance-vs-price basis. But it’s too
late. It doesn’t matter that Solaris on a Sun server is, in some ways,
technically superior to a Linux on Intel platform.

I would never choose Solaris today because it would be like buying a
Beta VCR, or a NeXT cube in the 1980s. Its sad that a company
responsible for so many technical innovations isn’t succeeding
economically but that’s life. Linux on Intel is the safe corporate
choice today. Funny to remember that installing Slackware on a 386 in
1996 made me feel like a revolutionary.

Peter


#12

Robert K. wrote:
CISC processor commands are more powerful, so you need

multiple RISC commands to do the same amount of work.

The original logic for RISC was that it was better to use die real estate
for registers than for rarely used instructions. However, with the
relentless march of Moore’s law we’ve not only got to where you can have
an ocean of registers on an x86 chip; you can now have several complete
cores on the same die!

That said, we note Solaris was actually running on x86, so it’s an OS
thing.


#13

Thanks Peter and Robert for all your insights about this subject. This
whole thing came to light when I took one of our web-based applications,
a Terminal Management System, and was going to port it to a
Solaris system, which happens to be a RISC platform. It was originally
written on an Ubuntu platform using an Intel CPU and was going to be
ported to Red Hat Enterprise Linux on a Dell blade system. Then
management changed and wanted it on Solaris. Long story there. When we
finally got it running, it was discovered it took 5 seconds to load a
simple login page! It is a Ruby on Rails application, and I found it was
during a simple request to a controller to build a very simple page. The
process used arrays to do “certain” things during the process of
building the page. Hence my simple Ruby program to test. I simply can’t
see any reason to re-architect the application so it will run on a RISC
platform. So in the end I am going to push for the original plan and put
this on a blade system of some type. Thanks again, and yeah, I am old
enough to remember those 50 floppies of Slackware, except I was a
Yggdrasil fan :-)


#14

2009/1/30 Peter B. removed_email_address@domain.invalid:

You have an answer, but it isn’t the right answer.

I am not sure I understand what you mean by that since you seem to
rather reconfirm what I wrote:

An Ultrasparc T1 processor is a little more powerful than either of the
Core 2 Duo CPUs you tested. You can see this from the published results for
the SPECjbb2005 or the specweb benchmarks.

Colin was not interested in learning how much potential a SPARC CPU has;
he wanted to know why his SPARC was outperformed by an Intel box.

The confusion arises because, when using your application as a benchmark,
you only make use of about 5% of the CPU resources of the Sun box. The T1
processor only runs at a clock speed of 1GHz but, being both multithreaded
and multi-core, it gives you a total of 24 threads to push work through. If
you used a benchmark that ran 24 or more instances of your application you
could expect to see greater throughput from the Sun host than the Intel
hosts.

As I said, there was just a single thread in the benchmark and this is
where SPARC processors fail miserably.

This is a great example of the “End of Moore’s Law” issue. We are at the
cusp of a change in hardware technology that could force all developers to
learn about concurrency and parallelizing workloads. The difference between
these two architectures is that Sun embraced the issue a little earlier than
Intel.

Frankly, I am not too optimistic that concurrency will be ubiquitous
soon. There are several reasons for this: judging from what I read in
public forums, the concept seems to be difficult to grasp for many
people. Also, testing a multithreaded application is significantly more
complex than testing a single-threaded application. And, lastly, there
is a vast amount of software that already exists and scarcely uses
multithreading; in other words, the effort to convert it would be very
high.

Side note: when I was at the university I picked a lecture about
communication in parallel computational models because at that time my
university (Paderborn, Germany) had one of the largest multiprocessor
systems around and was recognized as strong in that area. I did not
follow that path further on because it was easy to see that the theory
was still immature, Big-O calculus had large constants (i.e. algorithm
would be faster from a few million CPUs on) and network topologies had
to be tailored to the algorithm.

I agree with Robert that RISC has had its day in the sun (pun not intended),
but I disagree with the suggestion that this is due to technical
inferiority. The reality is that Sun sat pretty earning great margins for
their hardware for more than a decade. No longer commercially relevant, it’s
ironic that they are now, perhaps for the first time, competitive on a
performance-vs-price basis. But it’s too late. It doesn’t matter that Solaris
on a Sun server is, in some ways, technically superior to a Linux on Intel
platform.

IMHO Sun’s good position is not attributable to fast CPU speeds but
rather to features that make Solaris systems good server systems:
reliability, fault tolerance, IO performance etc. I guess in practice
most applications that need to scale to large numbers of users require
large IO bandwidth rather than CPU power (just think of typical web
applications like online shops).

I would never choose Solaris today because it would be like buying a Beta
VCR, or a NeXT cube in the 1980s.

Now you’re getting more pessimistic about it than me. :-)

It’s sad that a company responsible for so
many technical innovations isn’t succeeding economically but that’s life.
Linux on Intel is the safe corporate choice today. Funny to remember that
installing Slackware on a 386 in 1996 made me feel like a revolutionary.

Oh yes, I also remember those days when I copied Slackware onto 50+
floppy disks and installed it at home on my 386. Those are the days
when the phrase about the largest text adventure application
originated. :-)

Kind regards

robert


#15

Colin M. wrote:

Java runs 7 times slower on this RISC platform; I’m just going to drop it.
Everything runs slowly.

7 times slower than what? Non-RISC?

At least with JRuby you could parallelize easily in the same script.
Even the fastest SPARCs are nowhere near the clock speed of typical x86
chips, but they’re smaller, use less power, and you can fit dozens of
them in a machine.
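A sketch of what “parallelize easily in the same script” might look like under JRuby, where each `Thread` is a native thread (the array count and sizes here are arbitrary):

```ruby
require 'benchmark'

# Four independent arrays to sort; under JRuby each thread can run on
# its own core, so the sorts proceed in parallel.
arrays = (1..4).map { (1..100_000).map { rand } }

elapsed = Benchmark.realtime do
  arrays.map { |a| Thread.new { a.sort! } }.each { |t| t.join }
end

puts "sorted #{arrays.size} arrays in %.2fs" % elapsed
```

The same script runs unchanged on MRI; it just won’t get the parallel speed-up.
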

  • Charlie

#16

2009/1/30 Colin M. removed_email_address@domain.invalid:

Thanks Peter and Robert for all your insights about this subject. This
whole thing came to light when I took one of our web-based applications,
a Terminal Management System, and was going to port it to a
Solaris system, which happens to be a RISC platform. It was originally
written on an Ubuntu platform using an Intel CPU and was going to be
ported to Red Hat Enterprise Linux on a Dell blade system. Then
management changed and wanted it on Solaris. Long story there. When we
finally got it running, it was discovered it took 5 seconds to load a
simple login page!

That sounds extremely long. I’d dig into that issue. Do you have
database indexes in place etc.? I also believe that Rails has some
options that are convenient during development but slow things down,
and which should be switched off in production.
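For example, in a Rails 2.x-era app the settings below are the usual suspects when a “slow page” report turns out to be an app accidentally running with development conveniences enabled. Treat this as a sketch; the exact file and option names depend on the Rails version:

```ruby
# config/environments/production.rb (Rails 2.x era, illustrative)
config.cache_classes = true                      # don't reload models/controllers on every request
config.action_controller.perform_caching = true  # enable page/fragment caching
```
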

It is a Ruby on Rails application and I found it was
during a simple request to a controller to build a very simple page. The
process used arrays to do “certain” things during the process of
building the page.

Talk about vague information… :-)

Hence my simple Ruby program to test. I simply can’t
see any reason to re-architect the application so it will run on a RISC
platform. So in the end I am going to push for the original plan and put
this on a blade system of some type. Thanks again, and yeah, I am old
enough to remember those 50 floppies of Slackware, except I was a
Yggdrasil fan :-)

You either have a configuration issue or your app is slower than
necessary on any platform. If you want scalability from this app,
I would dig into the performance issue.

Cheers

robert


#17

On 2009-01-30 18:14:23 -0500, Robert K. removed_email_address@domain.invalid
said:

Talk about vague information… :-)
I would dig into the performance issue.

Cheers

robert

I’d second Robert’s comment. If your web app takes 5 seconds to build a
page on the Sun box and 500 msec on the Intel box, then it is slow on
both. I have found that New Relic does a great job of drilling into the
Rails side of problems, whilst Pagetest,
http://performance.webpagetest.org:8080/ does the same at the


#18

On 2009-01-30 11:28:49 -0500, Colin M. removed_email_address@domain.invalid said:

Thanks Peter and Robert for all your insights about his subject. This
whole thing came to light when I took one of our web based applications

This illustrates the importance of complete disclosure of information
and context.

I had assumed this was a long-running batch server. The fact that this
is a web application tells us that response time, rather than throughput,
is the variable of interest here.

Before learning that this was a web application, I had disagreed with
Robert’s view that the Intel PC was outperforming the Sun.

which is a Terminal management System and was going to port it to
Solaris system which happen to be a RISC platform. It was originally
written on a Ubuntu platform using an intel CPU and was going to be
ported to Red Hat Linux Enterprise on a Dell Blade system. Then
management changed and wanted it on Solaris. Long story there. When we
finally got it running it was discovered it took 5 seconds to load a
simple login page! It is a Ruby on Rails application and I found it was

My opinion is that, in general, 100 msec is a reasonable build-time
target for web pages.
Five seconds to load a login page indicates that your application or
infrastructure has a real problem.
That it’s less noticeable on Intel doesn’t mean you’re in the clear.

The best way to dig into this sluggishness is with New Relic. It is a
superb production profiler.