I need help saving table data from a rake task

I need to find out how I can create and save a large dataset to a table
based on multiple returned arrays from a Rake task.

Here is my example using just two arrays (there are 14 in this
particular rake task):

update_tsos_offense = TsosOffense.new
to_team_id, to_ppcs = update_tsos_offense.calculate_tsos(TotalOffense, "ydspgm", "desc")
ro_team_id, ro_ppcs = update_tsos_offense.calculate_tsos(RushingOffense, "ydspg", "desc")

This task starts with creating a new object (TsosOffense) which is the
model that houses the table I will eventually write/save data to.

It then calls a method from this new object and returns 2 results from
the model method for each call.

(e.g. to_team_id, to_ppcs all return Total Offense Team IDs, and a Total
Offense PPCS rating value)

(e.g. ro_team_id, ro_value, ro_ppcs all return Rushing Offense Team IDs,
and a Rushing Offense PPCS rating value)

The similarities between the two are:

Each array returned contains 120 records corresponding to 120 teams.
to_team_id matches a Team ID in the Teams Table (TsosOffense belongs_to
team)
ro_team_id matches a Team ID in the Teams Table (TsosOffense belongs_to
team)

The tsos_offenses table will have 15 columns:

A Team ID column for each of the 120 teams
A column for the PPCS values returned from each rake subtask…

===================

The million dollar question: I have all of the team IDs and PPCS
values for each call. How do I combine the entire set, build the table
according to each column, and save the table data (from the rake task)?

Please keep in mind that TsosOffense.new is open and active for all
subtasks…

I’m sure it’s something similar to TsosOffense.save (but I need to
understand how to build the table with each data subset corresponding to
the column).

Thank you.

(e.g. ro_team_id, ro_value, ro_ppcs all return Rushing Offense Team IDs,
and a Rushing Offense PPCS rating value)

Is supposed to read ro_team_id, ro_ppcs only… my apologies…

The diagram for my TsosOffense table will look like:

:team_id
:to_ppcs
:ro_ppcs
:po_ppcs
:so_ppcs
etc.
etc.

The ppcs columns correspond to the names of the variables for each
subtask that are storing the values…

The matches are:

to_teamid | to_ppcs
ro_teamid | ro_ppcs
po_teamid | po_ppcs
so_teamid | so_ppcs
etc.
etc.
… for 14 total rake subtasks

each of the matches contain 120 teams and 120 PPCS values corresponding
to that column/field.

I need to organize the entire dataset so that…

to_ppcs values go into the to_ppcs column where to_teamid == team_id
ro_ppcs values go into the ro_ppcs column where ro_teamid == team_id
po_ppcs values go into the po_ppcs column where po_teamid == team_id
so_ppcs values go into the so_ppcs column where so_teamid == team_id
etc.
etc.
… for 14 total arrays…

Each array contains exactly 120 rows of data…

I hope this additional information helps.

On Jul 17, 2009, at 11:21 AM, Älphä Blüë wrote:

etc.
etc.
… for 14 total arrays…

Each array contains exactly 120 rows of data…

I hope this additional information helps.

I think that you aren’t getting much response because:
A) you aren’t giving the right amount of detail
B) you seem to be asking for design help

From your other thread, http://pastie.org/548692 shows that you are
calling a .compiled_this_week method on each of several models, but
you don’t show that code (or even the inspect’ed array of data that
you have).

I think that a rake task is probably the wrong way to approach this.
It’s more likely that you want to do a script/runner Tsos.update (for
some method ‘update’ on the class Tsos). Your approach seems very much
procedural and not at all object-oriented.

If you back up just a bit and give some of your assumptions (like how
you get data into the database in the first place), you might get a
few useful hints as to how to proceed. Since you are asking your
questions on the Rails mailing list (or posting to that forum), I’m
going to assume that you do actually have a Rails application
surrounding this data, however, you haven’t really been asking Rails
questions so far. They’ve either been Ruby questions or design
questions. That might be part of the problem, too.

-Rob

Rob B. http://agileconsultingllc.com

Hi Alpha,

On Fri, 2009-07-17 at 17:09 +0200, Älphä Blüë wrote:

I need to find out how I can create and save a large dataset to a table
based on multiple returned arrays from a Rake task.

Since you’re already using Rake, the first approach I’d take to the task
would be to create a yaml file from the data returned above and then
load it using the db:fixtures:load task. You could potentially combine it
all into a single Rake task. Just a thought. Googling ‘rails create
yaml from object’ returns some potentially useful code snippets.
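
A minimal sketch of the YAML idea, assuming the ratings have already been collected into a hash keyed by team_id. The fixture_yaml helper, the team_N record labels, and the sample column name are illustrative and not part of the original task:

```ruby
require 'yaml'

# Hypothetical helper: turn { team_id => { column => value } } into
# fixture-style YAML with one labelled record per team.
def fixture_yaml(stats)
  records = {}
  stats.sort_by { |team_id, _ratings| team_id }.each do |team_id, ratings|
    row = { 'team_id' => team_id }
    ratings.each { |column, value| row[column.to_s] = value }
    records["team_#{team_id}"] = row
  end
  records.to_yaml
end

# Made-up sample data for two teams.
sample = { 2 => { :to_ppcs => 97.32 }, 1 => { :to_ppcs => 42.14 } }
puts fixture_yaml(sample)
```

The resulting string could be written to a fixtures file and loaded with the fixtures task; where that file lives depends on the application's setup.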

HTH,
Bill

Hi Rob,

Here are my assumptions:

I utilize Rake Tasks in the same way as one would a cron job - to
perform a procedural task on “something” once per week. The rake tasks
I create will be manually run at the start but in the end, I will have
cron jobs run each of the rake tasks that I create one to three times
per week.

What do my rake tasks do?

  1. Rake task one reaches out to a number of official ncaa sites,
    parses/scrapes all of the data for the given week and uploads that data
    into 37 statistical tables.

  2. Rake task two (the one I’m working on now which is still incomplete)
    compiles a ratings strength number for each team listed in each
    statistical table (37 tables), categorizing the data by offense,
    defense, special teams, and turnover margin. This data is then saved
    into their respective “ratings” tables (offense, defense, special teams,
    turnover margin).

  3. Rake task three (not designed yet) will go through the user
    tables and purge membership accounts based on a weekly subscription
    format.

These are the only 3 rake tasks I plan on using with my site for now.

About Design:

I’m really not asking for design help. I know how I personally see it
and how it “should” work. I’m asking if it can be done in the way I’ve
designed it. So far, all of the rake tasks I’ve created work perfectly
for what I’ve designed them to do thus far. You are correct in that I
have not supplied a lot of code with respect to rake task #2. The fact
is, I cannot supply more than the mechanics of it. The ratings system
is a private system that includes a lot of math functionality and is the
core reason why my site works so well.

Quoting Kenneth Massey (One of the BCS computer gurus from an email he
wrote to me):

=========================
Joel,

Thanks for sending me info about your system. I have added it here:
http://www.masseyratings.com/cf/compare.htm
under “Dezenzio TSRS”. It doesn’t correlate well with the other
rankings, but that’s actually a good thing - because your ranking system
is unique.

I like your idea of adjusting the raw statistical numbers to account for
strength of sched, and using st. dev. to normalize them is good.

Kenneth

I’ve worked really hard (over two years) to create the TSRS ratings
system and so I have to protect how it’s calculated and the way it
checks/offsets for variance.

So, I can only supply the “mechanics” of how I’m performing the task at
hand. There’s really no reason to show more than what I have.

I’ve isolated my question to the following:

(My Question)

Given 14 variables containing a foreign key value (team_id) that are
matched/paired with 14 variables containing a ratings number (PPCS), how
do I save the data I’ve already compiled inside of a rake task to the
table object that is currently open?

My assumption on the process would most likely be something similar to:

Create a constant for the table fields I want to update the values for:

TSOS_OFFENSE = [:team_id, :to_ppcs, :ro_ppcs, :po_ppcs, :so_ppcs,
etc…]

This would house the team_id foreign key and the columns that I want to
update.

update_tsos_offense.rows.each do |row|
  values = {:compiled_on => Date.today.strftime('%Y-%m-%d')}
  constant.each_with_index do |field, i|
    values[field] = row[i]
  end
  model.create values
end

update_tsos_offense is the name of the variable that opened the
Model.new object in my rake task.

Compiled on is a set date field that I use to write out the exact date
without time for each of the rows I add to my tables. I do this so that
I can allow people to use a calendar widget I created on my site to find
exact date matches…

Would something like this work? Or, do I need to tailor it some more?
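
A minimal sketch of the field-to-value pairing described above, assuming each row is an array whose positions line up with the field list. The attributes_for helper and the three-column field list are illustrative; the hash it returns is the kind of thing that would be handed to model.create:

```ruby
require 'date'

# Hypothetical helper: pair each field name with the value at the same
# position in the row, and stamp the row with a date-only string.
def attributes_for(fields, row, compiled_on = Date.today)
  values = { :compiled_on => compiled_on.strftime('%Y-%m-%d') }
  fields.zip(row) { |field, value| values[field] = value }
  values
end

# Illustrative field list; the real constant has 15 entries.
fields = [:team_id, :to_ppcs, :ro_ppcs]
p attributes_for(fields, [1, 42.14, 18.97], Date.new(2009, 7, 17))
```

This only works if every row really is ordered like the field list; if the data arrives keyed by name instead, the positions no longer mean anything.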

Please keep in mind that I’m still new to rails and so I’m trying to be
as informative as I can. I do provide a lot of information. If it’s
the wrong kind of information then please enlighten me as to what
information I should be adding to provide better results for my
question.

Thanks a lot Bill…

That makes a lot of sense and helps me out a ton. Let me see if I can
create a very simplified piece, pastie the code, and then see what help
I can get with it.

I will post back here shortly.

Hi Joel (at least now we know your name ;-) )
On Fri, 2009-07-17 at 20:08 +0200, Älphä Blüë wrote:

You are correct in that I have not supplied a lot of
code with respect to rake task #2. The fact is, I
cannot supply more than the mechanics of it. The
ratings system is a private system

A lot of us here have day jobs that preclude us posting ‘real’ code.

The solution to the dilemma we, and you, face (i.e., needing help and
not being able to supply the information folks here need to have to
provide it) is what’s known as ‘a sandbox.’

What you do is create a separate app that’s scaled down to include only
the components that are absolutely necessary to demonstrate the problem
you’re having. That’s what you post and ask questions about.

One of the things that has made it difficult to assist you is the excess
of information you’re providing. We’re here to help but, as I’m sure
you’ve noticed, there are more folks asking questions here than
providing answers. The more specific you make your problem statements,
the more likely it is that we’ll be able to help without leaving others
unattended.

HTH,
Bill

Here is the pastie code of my sandbox app:

http://pastie.org/549855

Please let me know if there’s other information you would like to see.
Thank you greatly (in advance) for any assistance on this.

Can I make a recommendation? Write some tests. That way you can tell
what exactly is passing and what is failing your expectations. It may
help you find your problem. It might be even easier for others to help.

On the other hand, if you want to push forward, it seems like the
pattern you are looking to implement is:

  • Grab data from an external data source and calculate statistics for
    some teams, each with a known unique arbitrary identifier.
  • Find the existing data for that same team and update each row
    matching the arbitrary identifier with the new data.

Questions:

  • Is the above an accurate assessment?
  • Do you have the first part handled?
  • Do you have any of the second part handled?

This would help me and possibly others on the list understand where
you think your progress has brought you and help us get you unstuck.

On Jul 17, 2009, at 1:28 PM, Älphä Blüë wrote:

Steve R. wrote:

Can I make a recommendation? Write some tests. That way you can tell
what exactly is passing and what is failing your expectations. It may
help you find your problem. It might be even easier for others to help.

The tests aren’t the issue Steve. All tests prior to this point work
100%. The issue is I do not “know how” to update [one] table with 14
arrays of data.

  • Grab data from an external data source and calculate statistics for
    some teams, each with a known unique arbitrary identifier.

No - not accurate. I already pulled the data. The data exists in my
“personal” tables (37 of them). This data is exact and completed from a
development and testing scenario.

  • Find the existing data for that same team and update each row
    matching the arbitrary identifier with the new data.

Find the existing data within 1 of 37 tables, perform mathematical
calculations on that data, assign a ratings value for each team, send
the data back to rake to be held in “queue” [persistent data].

Find the existing data within 2 of 37 tables, etc., etc. to be held in
“queue” [persistent data].

Do this for 14 of 37 tables. These 14 tables now make up 28 arrays (14
arrays that are paired with another by foreign_key).

Array 1 houses the team_id for table 1
Array 2 houses the rating for team_id for table 1

Array 3 houses the team_id for table 2
Array 4 houses the rating for team_id for table 2

etc…

I have all of these arrays populated with data, all verified through
testing, and all that check out perfectly…

==========

Now, the problem I’m having relates to:

Organizing the data into:

– one team_id (which represents array 1, array 3, array 5, etc.)
– field one (represents Array 2 rating)
– field two (represents Array 4 rating)
– field three (represents Array 6 rating)
… etc.
– field fourteen (represents Array 28 rating)

Finally, saving the data to the Table object that is currently open in
Rake in the exact format specified above…

On Jul 17, 2009, at 2:17 PM, Älphä Blüë wrote:

Finally, saving the data to the Table object that is currently open in
Rake in the exact format specified above…

So if I were to abstract this one level, I would say: “you want a
container-like data structure that can describe a unique odd-numbered
team id and 14 ratings for even-numbered teams. By iterating this
container, you will then update each database row that corresponds to
the team id.” Is this a correct description of your goal?

Sorry if I’m not getting this.

Steve R. wrote:

On Jul 17, 2009, at 2:17 PM, Älphä Blüë wrote:

Finally, saving the data to the Table object that is currently open in
Rake in the exact format specified above…

So if I were to abstract this one level, I would say: “you want a
container-like data structure that can describe a unique odd-numbered
team id and 14 ratings for even-numbered teams. By iterating this
container, you will then update each database row that corresponds to
the team id.” Is this a correct description of your goal?

Sorry if I’m not getting this.

hehe, hey steve, no really it’s okay mate. I wish I could describe
things better. I believe it’s because I’ve been working now about 11
hours a day 7-days a week on this project and I’m just a bit brain
fried.

Look back at the pastie code again and look all the way to the bottom.

I wrote a tiny loop in rake that showcases all of the arrays and their
data.

From that view point, I have all of the data. However, let’s take a
small step into the problem using just two examples…

array 1 holds team_id for table one.
array 2 holds rating value for team_id for table one.
array 3 holds team_id for table two.
array 4 holds rating value for team_id for table two.

All of these arrays contain exactly 120 rows of data. But I can’t
simply iterate them or match them up… If I did so I would see:

Rownum | Array 1 | Array 2 | Array 3 | Array 4
0      | 65      | 43.43   | 47      | 97.34
Notice that array 1 and array 3 which hold the team IDs do not match up.
Therefore I can’t just save the data by rows…

I need the existing data organized into a complete array so it looks
like:

Rownum | Array 1 | Array 2 | Array 3 | Array 4
0      | 1       | 42.14   | 1       | 18.97
1      | 2       | 97.32   | 2       | 49.97
2      | 3       | 54.22   | 3       | 87.12
As you can see from this view, I want all of the arrays gathered into one
large array, sorted by team_id.

I mean perhaps my issue is that when I return the data to rake, I need
to first sort the information somehow and then return it to rake. That
way all 28 arrays are already sorted by team_id and can just be iterated
over 120 rows…

I hope this makes sense.
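
A minimal sketch of the sorting idea above, assuming each team_id array and its rating array are parallel (same length, same order). The sort_pair_by_team helper is hypothetical and the numbers are made up:

```ruby
# Hypothetical helper: reorder a pair of parallel arrays by team id so
# that position i refers to the same team in both results.
def sort_pair_by_team(team_ids, ratings)
  pairs = team_ids.zip(ratings).sort_by { |team_id, _rating| team_id }
  pairs.transpose
end

ids, ratings = sort_pair_by_team([65, 1, 47], [43.43, 42.14, 97.34])
p ids      # => [1, 47, 65]
p ratings  # => [42.14, 97.34, 43.43]
```

Once every pair has been sorted this way, row i of each pair refers to the same team, provided all pairs cover the same set of team ids.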

On Jul 17, 2009, at 5:55 PM, Älphä Blüë wrote:

I mean perhaps my issue is that when I return the data to rake, I need
to first sort the information somehow and then return it to rake. That
way all 28 arrays are already sorted by team_id and can just be iterated
over 120 rows…

I hope this makes sense.

I think the underlying difficulty is that you need to learn about a
collection other than Array ;-)

Take a look at the docs for Hash.

It is sounding like you want a representation like:

[{ :team_id => 1, :desc_of_array_2 => 42.14, :desc_of_array_4 => 18.97 },
 { :team_id => 2, :desc_of_array_2 => 97.32, :desc_of_array_4 => 49.97 },
 # ...
]

or even a hash that maps team_id to its set of stats like:

{ 1 => { :desc_of_array_2 => 42.14, :desc_of_array_4 => 18.97 },
  2 => { :desc_of_array_2 => 97.32, :desc_of_array_4 => 49.97 },
  # ...
}

This collection kinda looks like an array when accessed because the
method is [] for both Array and Hash. If you want the array 12 value
for team 87, you’d have (assuming that the hash is in a variable
called stats):

stats[87][:desc_of_array_12]

I’m assuming that you’d have more “natural” names for the
:desc_of_array_N keys.

Note that I’m using :symbols, but you could use ‘strings’ instead.
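
A minimal sketch of that lookup, using made-up ratings and short illustrative stat names. The rating_for helper is hypothetical; it only shows a stricter alternative to chained [] calls:

```ruby
# Made-up ratings keyed by team id, then by a symbol naming the stat.
stats = {
  87 => { :to => 42.14, :ro => 18.97 },
  12 => { :to => 97.32, :ro => 49.97 }
}

# Hypothetical helper: fetch raises KeyError instead of returning nil
# when the team or the stat name is missing.
def rating_for(stats, team_id, stat)
  stats.fetch(team_id).fetch(stat)
end

puts stats[87][:ro]              # prints 18.97
puts stats[87]['ro'].inspect     # prints nil: 'ro' and :ro are different keys
puts rating_for(stats, 12, :to)  # prints 97.32
```

Symbols and strings both work as keys, but they are not interchangeable, so it pays to pick one and use it everywhere.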

-Rob

Rob B. http://agileconsultingllc.com

Hi Rob,

This is what I was looking for as a good start:

I had to use gist.github.com because pastie was not responding.

As you can see, it did exactly what I was trying to get by organizing
the multiple arrays into one large array of arrays…

Now I just have to figure out how to pass this to the update table and
iterate over it.

I’ll mull it over and try to post some code if I have some issues from
here…

Many thanks!

From Rake, I believe all I need to do is this:

update_tsos_offense.table_update(TsosOffense, stats)

Which will continue with the open object, call the table_update method
in TsosOffense model, and pass stats which holds the array.

Then, in the TsosOffense model table_update method, I can iterate over
the stats array passed…

Still some work to do but I believe it should be as simple as this…

Correct me if I’m wrong.

Thanks,

On Jul 17, 2009, at 6:33 PM, Rob B. wrote:

stats[87][:desc_of_array_12]

I’m assuming that you’d have more “natural” names for
the :desc_of_array_N

Note that I’m using :symbols, but you could use ‘strings’ instead.

-Rob

Just saw your other thread on the ruby list, but I’ll answer here, too.

Put this at the end of your pastie and see if it helps you see.

# referencing a new key causes an empty hash to be stored as the value
stats = Hash.new {|h,k| h[k] = {} }

# i will take on the same values as 0.upto(119)
120.times do |i|
  stats[to_team_id[i]][:to] = to_ppcs[i]
  stats[ro_team_id[i]][:ro] = ro_ppcs[i]
  stats[po_team_id[i]][:po] = po_ppcs[i]
  stats[so_team_id[i]][:so] = so_ppcs[i]
  stats[rzo_team_id[i]][:rzo] = rzo_ppcs[i]
  stats[flo_team_id[i]][:flo] = flo_ppcs[i]
  stats[pio_team_id[i]][:pio] = pio_ppcs[i]
  stats[too_team_id[i]][:too] = too_ppcs[i]
  stats[sao_team_id[i]][:sao] = sao_ppcs[i]
  stats[tflo_team_id[i]][:tflo] = tflo_ppcs[i]
  stats[peo_team_id[i]][:peo] = peo_ppcs[i]
  stats[fdo_team_id[i]][:fdo] = fdo_ppcs[i]
  stats[tdco_team_id[i]][:tdco] = tdco_ppcs[i]
  stats[fdco_team_id[i]][:fdco] = fdco_ppcs[i]
end

puts stats.inspect

# or this might be easier to read
require 'pp'
pp stats
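
The same merge can be sketched without one line per stat, assuming the paired arrays are first gathered into a hash of label => [team_ids, ratings]. The merge_ratings helper and the sample numbers are illustrative only:

```ruby
# Hypothetical helper: `pairs` maps a label to [team_ids, ratings],
# two parallel arrays of equal length.
def merge_ratings(pairs)
  # Referencing a new team id stores an empty hash for it.
  stats = Hash.new { |hash, team_id| hash[team_id] = {} }
  pairs.each do |label, (team_ids, ratings)|
    team_ids.each_with_index do |team_id, i|
      stats[team_id][label] = ratings[i]
    end
  end
  stats
end

# Made-up data: the two pairs list the teams in different orders.
stats = merge_ratings({
  :to => [[65, 1], [43.43, 42.14]],
  :ro => [[1, 65], [18.97, 97.34]]
})
p stats[1]
p stats[65]
```

Because each value is filed under its own team id, the order in which the arrays list the teams no longer matters.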

-Rob

Rob B. http://agileconsultingllc.com

Hi guys/gals,

Okay some good news. I’m able to pass the information to the correct
model and inspect it. I just don’t know how to iterate through this
type of array. As it contains a hash setup, I’m not as experienced with
this piece. Could someone give me some pointers on how to iterate
through this data in my update_table method?

Here’s what I have so far:

def table_update(model, constant, array)
  #puts array.inspect
  if model.compiled_this_week.find(:all).empty?
    puts "Updating #{model} for the following teams:"
    array.each do |row|
      values = {:compiled_on => Date.today.strftime('%Y-%m-%d')}
      constant.each_with_index do |field, i|
        values[field] = row[i]
      end
      model.create values
    end
  else
    # data is already populated for the week so don't update
    puts "Current Week's Ratings are Already updated!"
  end
end

compiled_this_week is just a scope that checks for between dates and I’m
finding out if the table is empty. If it is empty for the specified
dates (current week basically) I populate the table with new data…

“constant” refers to a constant I have setup in my environment.rb file
which houses the fields I’m going to populate in the table:

TSOS_OFFENSE = [:team_id, :totoff, :rushoff,
:passoff, :scoroff, :rzonoff, :fumlost, :passhint,
:tolost, :sacksall, :tackflossall, :passeff,
:firdwns, :thrdwncon, :fthdwncon]

“array” refers to the stats that were pulled in rake. They look and
appear just like the following if I perform a puts array.inspect:

{18=>{:tolost=>15, :passoff=>195.5, :fthdwncon=>44.44,
:sacksall=>2.42, :scoroff=>27.0, :tackflossall=>6.08,
:rzonoff=>0.85, :passeff=>130.13, :fumlost=>7,
:firdwns=>19.83, :totoff=>398.83, :passhint=>8,
:thrdwncon=>37.34, :rushoff=>203.33}

55=>{:tolost=>17, :passoff=>121.5, :fthdwncon=>40.0,
:sacksall=>2.42, :scoroff=>18.08, :tackflossall=>4.38,
:rzonoff=>0.91, :passeff=>94.95, :fumlost=>9,
:firdwns=>13.67, :totoff=>270.17, :passhint=>8,
:thrdwncon=>28.85, :rushoff=>148.67}

etc…

I just don’t know how to iterate through this type of array. As you can
see the array houses the exact names of my constant…

Any help would be appreciated…
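
A minimal sketch of iterating this structure, assuming it is a Hash of team_id => ratings hash as the inspect output suggests. Calling each on a Hash yields a key and a value per entry, so row[i] would index into a [team_id, ratings] pair rather than a list of column values. The each_team_sorted helper is hypothetical:

```ruby
# Hypothetical helper: yield each team's id and ratings hash in
# ascending team_id order.
def each_team_sorted(stats)
  stats.keys.sort.each do |team_id|
    yield team_id, stats[team_id]
  end
end

# Two entries abbreviated from the inspect output above.
stats = {
  55 => { :totoff => 270.17, :rushoff => 148.67 },
  18 => { :totoff => 398.83, :rushoff => 203.33 }
}

each_team_sorted(stats) do |team_id, ratings|
  puts "team #{team_id}: totoff=#{ratings[:totoff]} rushoff=#{ratings[:rushoff]}"
end
```

Inside the block, ratings[:totoff] reads a value by column name, so no positional index is needed.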

Rob,

I think this is probably the better way using your example:

RAKE

update_tsos_offense.table_update(TSOS_OFFENSE, stats) # constant, array

MODEL

def self.table_update(constant, array)
  if compiled_this_week.find(:all).empty?
    array.each do |row|
      values = {:compiled_on => Date.today.strftime('%Y-%m-%d')}
      constant.each_with_index do |field, i|
        values[field] = row[i]
      end
      self.create values
    end
  else
    # data is already populated for the week so don't update
    puts "Current Week's Ratings are Already updated!"
  end
end

That way I send a constant holding the fields that will be populated in
the table, the array (stats) which contains the data.

I haven’t tested this yet - but I think I’m close…

Okay, after testing and testing, I finally managed to get it all to
work. However, I’m sure my way is very clumsily implemented but it was
the only way I understood how to read the values and place them into the
table.

I called the following from Rake:

update_tsos_offense.table_update(TsosOffense, stats) # model, # array

And in the model for table_update I did:

def table_update(model, array)
  if model.compiled_this_week.find(:all).empty?
    puts "Updating #{model} for the following teams:"
    120.times do |i|
      team = Team.find(i + 1)
      values = {:compiled_on => Date.today.strftime('%Y-%m-%d')}
      values[:team_id] = i + 1
      values[:totoff] = array[i + 1][:totoff]
      values[:rushoff] = array[i + 1][:rushoff]
      values[:passoff] = array[i + 1][:passoff]
      values[:scoroff] = array[i + 1][:scoroff]
      values[:rzonoff] = array[i + 1][:rzonoff]
      values[:fumlost] = array[i + 1][:fumlost]
      values[:passhint] = array[i + 1][:passhint]
      values[:tolost] = array[i + 1][:tolost]
      values[:sacksall] = array[i + 1][:sacksall]
      values[:tackflossall] = array[i + 1][:tackflossall]
      values[:passeff] = array[i + 1][:passeff]
      values[:firdwns] = array[i + 1][:firdwns]
      values[:thrdwncon] = array[i + 1][:thrdwncon]
      values[:fthdwncon] = array[i + 1][:fthdwncon]
      model.create values
      puts "#{team.name} values are being saved."
    end
  else
    # data is already populated for the week so don't update
    puts "Current Week's Ratings are Already updated!"
  end
end

I had to add 1 because the i count started at 0. I also couldn’t use
the constant or iterate using each or each_with_index because I
couldn’t get it to sort.

This way does work though and so I’m happy that it at least is
functioning. Although, I’m sure it can use some cleanup.
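
One possible cleanup, sketched in plain Ruby under the assumption that the stats hash is keyed by team_id and that the rating column names match the hash keys. The rows_for helper is hypothetical, and it only builds the attribute hashes; the model.create call is left as a comment:

```ruby
require 'date'

# Hypothetical helper: build one attributes hash per team, in team_id
# order. `fields` lists the rating columns to copy from each team's hash.
def rows_for(stats, fields, compiled_on = Date.today)
  stats.keys.sort.map do |team_id|
    row = { :team_id => team_id, :compiled_on => compiled_on.strftime('%Y-%m-%d') }
    fields.each { |field| row[field] = stats[team_id][field] }
    row
  end
end

# Made-up data for two teams and two of the rating columns.
stats = { 2 => { :totoff => 270.17, :rushoff => 148.67 },
          1 => { :totoff => 398.83, :rushoff => 203.33 } }

rows_for(stats, [:totoff, :rushoff]).each do |attributes|
  # In the model, this is where model.create(attributes) would go.
  p attributes
end
```

Iterating over the hash's own keys avoids the i + 1 arithmetic and does not depend on team ids running from 1 to 120 without gaps.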