Avoiding ActiveRecord loading for efficiency with lots of rows


#1

Hello all,

This is my first post to this group, so please let me know if there is
any protocol I’m supposed to follow that I missed. Anyway, I’ve been
developing a Rails application for handling the registration and score
data of a math competition. There is an existing Java applet that we
use that generates the score data with reference to problem_id,
solver_id, and the result (which is - if blank, 0 if wrong, 1 if
right, and =n if n points of partial credit are to be awarded). It
saves each individual’s answer to each question as a separate row in
the scores table.
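
For illustration, the result encoding described above could be decoded along these lines. This is only a sketch: `parse_result` is a hypothetical helper (not part of the applet or of Rails), and it assumes that a right answer counts as 1 point and that n in "=n" is a non-negative integer.

```ruby
# Hypothetical helper: decode one `result` string from the scores table.
# Assumed encoding, taken from the description above:
#   "-" blank, "0" wrong, "1" right, "=n" n points of partial credit.
def parse_result(raw)
  case raw
  when "-"          then nil                         # blank answer, no score
  when "0", "1"     then raw.to_i                    # wrong / right
  when /\A=(\d+)\z/ then Regexp.last_match(1).to_i   # partial credit
  else raise ArgumentError, "unrecognized result: #{raw.inspect}"
  end
end
```

Returning nil for a blank keeps "not answered" distinct from "answered wrong", which may or may not matter for the scoring rules.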

I’ve written a script to process this data in Rails. However, in 2008
there were over 15,000 rows in the scores table, and my current code
loads every one of them. My current load time is about 5 seconds, and
I think it’s due to making an ActiveRecord object for every one. Is
there a way to bypass making a separate object for each? I do need
access to the data, but it would be nicest as an array or something
under Solver. One solution I’m currently considering is using raw SQL
for this, but I thought I’d see if anyone had any thoughts to weigh
in.

Thanks a lot!

Greg


#2

There are several ways:

  • A Problem has_many Scores, a Solver has_many Scores, and a Score
    belongs_to both a Problem and a Solver. Process one Problem at a
    time, or one Solver at a time.

  • SQL, e.g.
    ActiveRecord::Base.connection.select_rows('SELECT problem_id, solver_id, result FROM scores')
    Returns an array of arrays of strings with the values in the order
    specified. E.g.
    [['123', '345', '-'],     # problem 123, solver 345, result '-'
     ['124', '1024', '123']]  # problem 124, solver 1024, result '123'

    Remember, all values are strings and may need to be converted to
    integers.

  • Use will_paginate to retrieve results in clumps, e.g. 100 results at a
    time.
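
As a rough sketch of the SQL option, the rows that select_rows returns could be regrouped into "an array or something under Solver" without building any ActiveRecord objects. The `rows` literal below is a made-up stand-in for the query result so the snippet runs without a database, and `results_by_solver` is a hypothetical helper name.

```ruby
# Stand-in for what
#   ActiveRecord::Base.connection.select_rows(
#     "SELECT problem_id, solver_id, result FROM scores")
# would return; the values are invented for illustration.
rows = [
  ["123", "345", "-"],
  ["124", "345", "1"],
  ["123", "1024", "=2"],
]

# Hypothetical helper: group results by solver as
#   { solver_id => { problem_id => raw result string } }
# IDs are converted to integers; the result string is kept as-is.
def results_by_solver(rows)
  grouped = Hash.new { |hash, key| hash[key] = {} }
  rows.each_with_object(grouped) do |(problem_id, solver_id, result), acc|
    acc[solver_id.to_i][problem_id.to_i] = result
  end
end

by_solver = results_by_solver(rows)
```

Calling to_i on the IDs is harmless if a given database adapter already returns integers rather than strings.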

HTH,
Jeffrey


#3

http://enterpriserails.rubyforge.org/hash_extension/

Haven’t tried this myself, but it should be faster than loading 15k
ActiveRecord objects.