jRuby 10x slower than MRI for highly dynamic code?

dubstep · March 21, 2011, 1:39pm

Hello,

I’m trying to understand a performance difference of an order of
magnitude between the same sinatra code running on ruby 1.8.7 with
thin and on JRuby 1.6.0 deployed on JBoss 5.1.0.

I’ve added some simple execution time measurement using Time.now and I
get the following results:

Ruby 1.8.7: 0.107098s
JRuby 1.6.0: 1.433s

I’m sure I’m doing something wrong or I’ve tuned jRuby/JBoss wrongly,
but I don’t know exactly what.

You can find the code at http://pastebin.com/MZ9WaPQx with nice syntax
highlighting or below.

Thanks for any pointers in the right direction,

Johannes

P.S. The output of the code looks like the following:

{“aaData”:[[“22.03. 08h”,5.0],[“22.03. 10h”,5.8],[“22.03.
14h”,6.3],[“22.03. 20h”,6.1],[“23.03. 08h”,4.6],[“23.03.
10h”,6.4],[“23.03. 14h”,6.3],[“23.03. 20h”,5.5],[“24.03.
08h”,4.9],[“24.03.
10h”,7.0]],“iTotalRecords”:560,“iTotalDisplayRecords”:560,“sEcho”:“1”}

Code:

DataTables backend. See also DataTables - Usage

get
%r{^/patient/([\d]+)/table/(blood_pressure|blood_sugar|weight|alert|treatment)$}
do |patient_id,table|
start = Time.now

content_type :json
@user = get_user()
@patient = @user.get_patient(patient_id)
if @patient.nil?
raise Sinatra::NotFound
end

iDisplayStart = params[“iDisplayStart”].to_i
iDisplayStart = nil if iDisplayStart == 0

Fields and transformations for each table. See comment below.

fields = {
“blood_pressure” => [[[“time”], [“strftime”, “%d.%m. %Hh”]],
[[“systolic”]], [[“diastolic”]], [[“pulse”]]],
“blood_sugar” => [[[“time”], [“strftime”, “%d.%m. %Hh”]],
[[“glucose”], [“to_f”]]],
“weight” => [[[“time”], [“strftime”, “%d.%m. %Hh”]], [[“weight”],
[“to_f”]]],
“alert” => [[[“time”], [“strftime”, “%d.%m. %Hh”]],
[[“severity”]], [[“alert”]]],
“treatment” => [[[“time”], [“strftime”, “%d.%m. %Hh”]],
[[“slow_insulin_morning”]], [[“slow_insulin_night”]],
[[“fast_insulin_morning”]], [[“fast_insulin_lunch”]],
[[“fast_insulin_afternoon”]], [[“fast_insulin_evening”]],
[[“metformin_morning”]], [[“metformin_lunch”]],
[[“metformin_evening”]]]
}

Create a list of order terms as argument to sequel’s order function.

Results looks like [:time, :weight.desc]

iSortingCols = params[“iSortingCols”].to_i
order_by = if iSortingCols > 0
order_arr = []
table_fields = fields[table].length
iSortingCols.times do |n|
col_num = params[“iSortCol_#{n}”].to_i if
params[“iSortCol_#{n}”].match /^[0-9]/
col = fields[table][col_num][0][0].to_sym if col_num < table_fields
if not col.nil?
col = col.desc if params[“sSortDir_#{n}”] == “desc”
order_arr.push col
end
end
order_arr
else
[fields[table][0][0][0].to_sym.desc]
end

For aaData metaprogramming is used for general data driven code.

The list defined in fields is used to create a list of lists, where

each

list contains a data row.

[[“weight”], [“to_f”]] is thereby equivalent to row.weight.to_f

{
:iTotalRecords => @patient.agent.blood_pressures.count,
:iTotalDisplayRecords => @patient.agent.blood_pressures.count,
:sEcho => params[“sEcho”].to_i.to_s,
:aaData => @patient.agent.send(table + “s_dataset”).order(*order_by)
.limit(10,iDisplayStart).map do |row|
fields[table].map do |field|
field.reduce(row) do |obj,method|
obj.send(*method)
end
end
end
}.to_json
(Time.now - start).to_s
end

Johannes_Krampf · March 21, 2011, 1:57pm

Hi,
Have you tried running it on mongrel / torquebox / trinidad or glassfish
with jruby ?
Last time I checked running a warbled rails app directly on JBoss was
way slower than
running it directly on a mongrel server.

To the point, more info about your setup is required for us to properly
be able to evaluate your troubles.

Cheers,
Lus Landeiro R.

Johannes_Krampf · March 21, 2011, 2:21pm

Hi Luís,

thanks for your answer!

I’ve run it on Trinidad and it’s still 8x slower with ~0.805s. (After
multiple request to allow for warm up.)

I’ll try to give you all possible information about the setup. Please
tell me if some information is missing:

I’m running Debian 6.0 with Sun JRE 1.6 in a VirtualBox VM with 2GB
RAM on a Windows 7 SP1 computer (Dell Precision T1500: Intel Core i7
4x2.93GHz, 16GB RAM).

I start JBoss with JAVA_OPTS=“-Xms1024m -Xmx1024m
-XX:MaxPermSize=256m -Dorg.jboss.resolver.warning=true
-Dsun.rmi.dgc.client.gcInterval=3600000
-Dsun.rmi.dgc.server.gcInterval=3600000”.

I use JRuby 1.6.0 and I create a war file using warbler. In config.ru
I enable the production environment.

Is this the information which might help you? If not, which
information do you need?

Cheers,

Johannes

2011/3/21 Luís Landeiro R. [email protected]:

Johannes_Krampf · March 22, 2011, 1:18am

You might want to try to isolate the code that hits the DB from
everything
else and see if it’s JRuby itself, or the execution of the query that’s
running slowly. Maybe put in another (Time.now - start).to_s statement
around line 42 and compare those times. At least that could help you
isolate what is taking so long.

Joe

On Mon, Mar 21, 2011 at 8:20 AM, Johannes Krampf <

Johannes_Krampf · March 22, 2011, 8:45am

Hello Joe!

Thanks for your answer - you showed me the right path. (I should have
though of using the poor man’s profiler myself…)

Short answer: JRuby is just as fast as MRI and I was wrong to suspect
the Ruby code after seeing “Object.new” on the top place in ruby-prof.

The problem is related to DB access. While the query log shows query
times of less than 10 milliseconds, actually executing the query takes
quite some time longer, which caused the big difference between MRI
and JRuby.

I’ll research why it takes so much longer when using the JDBC driver
and thank you for your patience with me.

Cheers,

Johannes

2011/3/22 Joseph A. [email protected]:

Johannes_Krampf · March 22, 2011, 4:16pm

On Tue, Mar 22, 2011 at 8:44 AM, Johannes Krampf
[email protected] wrote:

The problem is related to DB access. While the query log shows query
times of less than 10 milliseconds, actually executing the query takes
quite some time longer, which caused the big difference between MRI
and JRuby.

I’ll research why it takes so much longer when using the JDBC driver
and thank you for your patience with me.

Since you’re running JRuby inside JBoss TX management is likely
different. That may be the reason or one of the reasons. Also, a JEE
server does a lot more stuff that might interfere.

Kind regards

robert

–
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Johannes_Krampf · March 23, 2011, 7:56am

It might just be that you need to optimize the way you perform DB
queries.

I suggest you have a look at Sequel together with JDBC, and store
queries in hashes or arrays using Sequel, and globally minimize the
amount of queries.

By doing so, my current DB implementation using JDBC/Sequel/Jruby is
actually better than MRI/DBI…

Christian

On Tue, Mar 22, 2011 at 8:44 AM, Johannes Krampf

Johannes_Krampf · March 23, 2011, 2:01pm

On Wed, Mar 23, 2011 at 1:42 PM, Johannes Krampf
[email protected] wrote:

Hi Christian,

you helped me solve this particular problem! (But I have new one…)

Glad the suggestion helped.

Get glucose data for presentation in graph

---------------±------
1300872719743 | 104

Both are integers and no conversion should be necessary. (I wanted to
try the :convert_types option, but I gave up after having to do too
many unrelated changes.)

From what I see you use sql_functions, which means the sql server will
do the calculations.

One possible improvement in your code would be to select and map on
the go, like this (untested, since I do not have access to your db):

def glucose_data()
self.blood_sugars_dataset.select_map([:round.sql_function(:extract.sql_function(‘epoch
from “time”’.lit)*1000), :glucose])
end

This code executes in ~130ms using MRI and postgres. The minimum time
I could get for the JRuby version after multiple requests is ~530ms.
The database takes 5ms according to Sequel’s query log.

It could be latency through the jdbc driver. The only solution is to
find a faster jdbc driver (usually you can find many alternatives for
1 backend) and to reduce to as little as possible the amount of
queries you do.

I’m afraid this might be related to my earlier problem, i.e. loading a
lot of data from the database into an array. Do you have any ideas,
what I might be doing wrong?

I don’t think loading some data from db into an array will be slowing
you down. On the contrary, if you later use a lot this data for a lot
of calculations, you’ll save a lot of time, compared to 1 calculation
requesting 1 sql query each time.

My 2 cents

Christian

Johannes_Krampf · March 25, 2011, 2:50pm

Hello everyone,

I made some changes to my environment and performance is now similar
for MRI and JRuby.

I use now a bare-metal Ubuntu 11.04 alpha running torquebox. There are
too many factors to find the reason for this slowdown.

Thanks again for all your help,

Johannes

2011/3/23 Christian MICHON [email protected]:

One possible improvement in your code would be to select and map on
the go, like this (untested, since I do not have access to your db):

def glucose_data()

self.blood_sugars_dataset.select_map([:round.sql_function(:extract.sql_function('epoch

from “time”'.lit)*1000), :glucose])
end

This works, but unfortunately this will result in tuples inside a
string in the resulting json, but thanks for the suggestion!

Johannes_Krampf · March 23, 2011, 1:43pm

Hi Christian,

you helped me solve this particular problem! (But I have new one…)

I issued a count on an array instead of the dataset resulting in
loading all data from the DB and counting the elements instead of a
SELECT COUNT(*) as intended. The code running on JRuby is now about as
fast as

Now I have a problem when I really need to load all data from the
database. I call the following function and issue a to_json on the
result.

# Get glucose data for presentation in graph
def glucose_data()
  self.blood_sugars_dataset.
     select(:round.sql_function(:extract.sql_function('epoch from

“time”'.lit)*1000), :glucose).
map{|row| [row[:round], row.glucose]}
end

This results in 29.4KB of uncompressed JSON data. There are only 560
rows in the database, each row is similar to the following
round | pulse
---------------±------
1300872719743 | 104

Both are integers and no conversion should be necessary. (I wanted to
try the :convert_types option, but I gave up after having to do too
many unrelated changes.)

This code executes in ~130ms using MRI and postgres. The minimum time
I could get for the JRuby version after multiple requests is ~530ms.
The database takes 5ms according to Sequel’s query log.

I’m afraid this might be related to my earlier problem, i.e. loading a
lot of data from the database into an array. Do you have any ideas,
what I might be doing wrong?

best regards,

Johannes

2011/3/23 Christian MICHON [email protected]: