Refactoring and Improving my Rake Task

Here’s my rake task, I’m quite proud of it it’s all working, but it’s
going to be really important to my rails app. Especially the fastercsv
bit where I read in all the records and create activerecord objects -
it’s not very flexible at the moment and I’m not certain the best way to
make it more flexible, for example if I want ot use different fields.

If anyone has time I’d really value a general critique and any tips.
especially for improving the resorts:load task.

best

BB.

namespace :peaklocation do

desc “Count the resorts in the database”
task :database_report => :environment do
puts
records = Resort.find(:all)
puts "Resorts : " + records.length.to_s
records = Advert.find(:all)
puts "Adverts : " + records.length.to_s
records = User.find(:all)
puts "Users : " + records.length.to_s
records = User.find(:all)
count = 0
for user in records
if user.admin
count += 1
end
end
puts "Admins : " + count.to_s
end

Resort Tasks

namespace :resorts do

# note that the following command might be needed on the data file
# tr -d '\n' < peaklocation/db/resorts.csv >

peaklocation/db/resorts1.csv

desc "Load up all the resorts from resorts1.csv into #{Rails.env}

database"
task :load => :environment do
require ‘fastercsv’
count=0
FasterCSV.foreach("#{RAILS_ROOT}/db/resorts1.csv", :row_sep =>
“\r”) do |row|
# name,serial_no,region = row
Resort.create(:name => row[0],:serial_number =>
row[1],:country => row[7],:resort_height => row[16],:overview_review =>
row[141],:region => row[6],:continent => row[8],:coordinates_east =>
row[10],:coordinates_north => row[11],:travel_to_review => row[146])
count += 1
# puts count.to_s + " Resort name: " + row[0]
puts "Resort created : " + row[0]
end
puts “-------------------------”
puts "Resorts created : " + count.to_s
puts “-------------------------”

  Rake::Task["peaklocation:database_report"].invoke

end

desc "Delete all the resorts from the #{Rails.env} database"
task :unload => :environment do

  Resort.delete_all

  Rake::Task["peaklocation:database_report"].invoke



end

desc "List all the resorts from the database"
task :list => :environment do
  resorts = Resort.find(:all)
  for resort in resorts
    puts resort.id.to_s + ' ' + resort.name
  end

  Rake::Task["peaklocation:database_report"].invoke


end

end

Advert Tasks

namespace :adverts do
desc “Load up all the advert_types into #{Rails.env} database”
task :load_advert_types => :environment do
AdvertType.create!(:name => “Accommodation”, :description =>
“Accom”)
AdvertType.create!(:name => “Food”, :description => “Food”)
AdvertType.create!(:name => “Ski Hire”, :description => “Ski
Hire”)
end
end

User Tasks

namespace :users do
desc “Make a user admin in #{Rails.env} database”
task :make_admin => :environment do
username = ENV[‘USERNAME’] || ENV[‘username’]
raise “Must specify USERNAME=” unless username
u = User.find_by_username(username)
if u
puts “#{u.username} found”
u.admin = 1
puts “#{u.username} set to admin”
u.save
puts “#{u.username} saved”
else
puts “#{username} not found”
end
end
end

end

Michael,

That’s superb, like Christmas has come early! I’m going to review all
your improvements to understand them and then implement, learned a LOT
here. Thanks, will feedback.

Follow up question. How would you handle the migration file in my case
(for the CSV import I have to create the fields in advance of course via
the migration). Currently I rather boringly hand type them. Am wondering
if it’s possible to read the first row of the CSV (headers) and maybe
create all the fields in one go ! I suppose I would have to have them as
all the same data type e.g. string. Maybe not a smart move.

current migration like this…


class CreateResorts < ActiveRecord::Migration
def self.up
create_table :resorts do |t|

   t.string :name
   t.string :serial_number
   t.string :country

   t.string :region
   t.string :continent

   t.string :coordinates_east
   t.string :coordinates_north

   t.text :overview_review
   t.text :travel_to_review

   t.integer :resort_height

  t.timestamps
end

end

def self.down
drop_table :resorts
end
end


Would appreciate your thoughts - the CSV import rake task IS totally
what this app is about.

Best

bb

bingo bob wrote:

Here’s my rake task, I’m quite proud of it it’s all working, but it’s
going to be really important to my rails app. Especially the fastercsv
bit where I read in all the records and create activerecord objects -
it’s not very flexible at the moment and I’m not certain the best way to
make it more flexible, for example if I want ot use different fields.

If anyone has time I’d really value a general critique and any tips.
especially for improving the resorts:load task.

best

BB.

namespace :peaklocation do

desc “Count the resorts in the database”
task :database_report => :environment do
puts
records = Resort.find(:all)
puts "Resorts : " + records.length.to_s
records = Advert.find(:all)
puts "Adverts : " + records.length.to_s
records = User.find(:all)
puts "Users : " + records.length.to_s
records = User.find(:all)
count = 0
for user in records
if user.admin
count += 1
end
end
puts "Admins : " + count.to_s
end

Resort Tasks

namespace :resorts do

# note that the following command might be needed on the data file
# tr -d '\n' < peaklocation/db/resorts.csv >

peaklocation/db/resorts1.csv

desc "Load up all the resorts from resorts1.csv into #{Rails.env}

database"
task :load => :environment do
require ‘fastercsv’
count=0
FasterCSV.foreach(“#{RAILS_ROOT}/db/resorts1.csv”, :row_sep =>
“\r”) do |row|
# name,serial_no,region = row
Resort.create(:name => row[0],:serial_number =>
row[1],:country => row[7],:resort_height => row[16],:overview_review =>
row[141],:region => row[6],:continent => row[8],:coordinates_east =>
row[10],:coordinates_north => row[11],:travel_to_review => row[146])
count += 1
# puts count.to_s + " Resort name: " + row[0]
puts "Resort created : " + row[0]
end
puts “-------------------------”
puts "Resorts created : " + count.to_s
puts “-------------------------”

  Rake::Task["peaklocation:database_report"].invoke

end

desc "Delete all the resorts from the #{Rails.env} database"
task :unload => :environment do

  Resort.delete_all

  Rake::Task["peaklocation:database_report"].invoke



end

desc "List all the resorts from the database"
task :list => :environment do
  resorts = Resort.find(:all)
  for resort in resorts
    puts resort.id.to_s + ' ' + resort.name
  end

  Rake::Task["peaklocation:database_report"].invoke


end

end

Advert Tasks

namespace :adverts do
desc “Load up all the advert_types into #{Rails.env} database”
task :load_advert_types => :environment do
AdvertType.create!(:name => “Accommodation”, :description =>
“Accom”)
AdvertType.create!(:name => “Food”, :description => “Food”)
AdvertType.create!(:name => “Ski Hire”, :description => “Ski
Hire”)
end
end

User Tasks

namespace :users do
desc “Make a user admin in #{Rails.env} database”
task :make_admin => :environment do
username = ENV[‘USERNAME’] || ENV[‘username’]
raise “Must specify USERNAME=” unless username
u = User.find_by_username(username)
if u
puts “#{u.username} found”
u.admin = 1
puts “#{u.username} set to admin”
u.save
puts “#{u.username} saved”
else
puts “#{username} not found”
end
end
end

end

I refactored your tasks a bit. I moved the guts of most of the tasks to
class methods on the seemingly appropriate class. When generating the
report I removed the use of String#+ and replaced it with interpolation
#{}. I also used ActiveRecord::Base.count instead of .find and then
.length on that. I tried to make some comments and suggestions.

class Database::Report
def self.to_s
puts
resort_count = Resort.count
puts “Resorts : #{resort_count}”
advert_count = Advert.count
puts “Adverts : #{advert_count}”
user_count = User.count
puts “Users : #{user_count}”
users = User.find(:all)
admin_count = 0
users.each do |user|
if user.admin
count += 1
end
end
# couldn’t you use raw sql here?
# admin_count = User.count(:conditions => [“admin = ?”, true])

puts "Admins  : #{admin_count}"

end
end

require ‘fastercsv’

class Resort
def self.import_from_csv(filename)
count=0

# wrap in transaction in the event one of the resort's fails
# validation and raises an exception
Resort.transaction do
  # I see you were removing the newline characters with tr, I
  # changed your row_sep to "\r\n" thinking that might help.
  # However if that does not work I would load the string up in
  # ruby and perform the transformation with a regexp, then pass
  # it off the the appropriate FasterCSV method for a string.
  FasterCSV.foreach(filename, :row_sep => "\r\n") do |row|
    # name,serial_no,region = row
    # use ! (bang) version of create to throw exception if
    # validation fails, helpful with transactions
    Resort.create!(:name => row[0],
                   :serial_number => row[1],
                   :country => row[7],
                   :resort_height => row[16],
                   :overview_review => row[141],
                   :region => row[6],
                   :continent => row[8],
                   :coordinates_east => row[10],
                   :coordinates_north => row[11],
                   :travel_to_review => row[146])
    count += 1
    puts count.to_s + "#{count} Resort name: #{row[0]}"
    puts "Resort created : " + row[0]
  end
end
puts "-------------------------"
puts "Resorts created     : #{count}"
puts "-------------------------"

end
end

namespace :peaklocation do

desc “Count the resorts in the database”
task :database_report => :environment do
puts Database::Report
end

namespace :resorts do

desc "Load up all the resorts from resorts1.csv into #{Rails.env} 

database"
task :load => :environment do
filename = ENV[‘FILENAME’] || “#{RAILS_ROOT}/db/resorts1.csv”
Resort.import_from_csv(filename)
Rake::Task[“peaklocation:database_report”].invoke
end

desc "Delete all the resorts from the #{Rails.env} database"
task :unload => :environment do
  Resort.delete_all
  Rake::Task["peaklocation:database_report"].invoke
end

desc "List all the resorts from the database"
task :list => :environment do
  Resort.find(:all).each do |resort|
    puts "#{resort.id} #{resort.name}"
  end

  Rake::Task["peaklocation:database_report"].invoke
end

end

namespace :adverts do
desc “Load up all the advert_types into #{Rails.env} database”
task :load_advert_types => :environment do
adverts = [[“Accomodation”, “Accom”],
[“Food”, “Food”],
[“Ski Hire”, “Ski Hire”],
].each do |name, description|
AdvertType.create!(:name => name, :description => description)
end
end
end

namespace :users do
desc “Make a user admin in #{Rails.env} database”
task :make_admin => :environment do
username = ENV[‘USERNAME’] || ENV[‘username’]
raise “Must specify USERNAME=” unless username
user = User.find_by_username(username)
if user
puts “#{user.username} found”
u.admin = 1
puts “#{user.username} set to admin”
u.save
puts “#{user.username} saved”
else
puts “#{username} not found”
end
end
end
end

Best,
Michael G.

Oh, one other thing. Moving stuff to class methods makes sense I guess
as it makes the rakefile cleaner an I guess it’s just the right thing to
do.

Where do I put stuff though and what syntax. I mean so I have a
resort.rb in app/models/resort.rb so I guess that’s where the fastercsv
import method goes. What about he database report method, where can I
put that?

Cheers,

BB

bingo bob wrote:

Oh, one other thing. Moving stuff to class methods makes sense I guess
as it makes the rakefile cleaner an I guess it’s just the right thing to
do.

Where do I put stuff though and what syntax. I mean so I have a
resort.rb in app/models/resort.rb so I guess that’s where the fastercsv
import method goes. What about he database report method, where can I
put that?

Just stick the import_from_csv method in the resort model (I assume you
already have this file). Database::Report was just a lame example of
one way to approach it. You could put the code in any class you feel
most appropriate. If you wanted to use it as is, it could live at
app/models/database/report.rb or lib/database/report.rb and either
location would work with Rails dependency loading / mapping.

Best,
Michael G.

Hi Michael,

May thanks for this, just so you know, finding this very useful.

Just a few things.

  1. What’s the general rationale for moving stuff out of the rake task
    and into the heart of the app/model methods or lib dir - I guess it
    leaves the rake task a lot cleaner really - fine with it, just wondering
    about the rationale best practice. Makes the code useable throughout the
    app?

  2. A mighty thanks for the FasterCSV.foreach(filename, :row_sep =>
    “\r\n”) code, works perfectly, as you say I was having to “pre treat” my
    CSV files beforehand, an extra and unnecessary step, this works a treat.
    This code is now moved to the resort model in app/model/resort.rb, works
    fine!

  3. Interpolation for resorts within the strong, rather than put + ’ ’ +
    ’ ’ + etc, is more elegant I imagine. Done.

  4. Using the bang version of create, very useful to throw exceptions as
    you say.

  5. If you get a chance to comment on the migration problem that’d be
    useful - maybe I shouldn’t worry about it. I was thinking to have the
    migration be coded in pseudo-code such that.

  • read the first row of the csv
    (I’ll put data types in here, e.g. string, text, boolean)

  • read the second row of the csv for the field names
    (e.g. name, serial_number, country)

  • generate code such that the migration is created almost dynamically
    like this t.[data type goes here] :[field name goes here]

However - maybe I’m overcomplicating matters and it’s better just to do
the migration by hand just including the required fields.

  1. I moved the database report to lib/database/report.rb, kind of worked
    but I’m a little confused, what happened when I called it from the rake
    task was that I got the output but on the end I had a “Class #223jjk” <-
    somehting like that, listed on the end of the output. I got rid of it by
    commenting out the def self.to_s and end statement, but I’m sure this is
    incorrect. I was calling it from the rake task with simply puts
    Database::Report.

bingo bob wrote:

Hi Michael,

May thanks for this, just so you know, finding this very useful.

Just a few things.

  1. What’s the general rationale for moving stuff out of the rake task
    and into the heart of the app/model methods or lib dir - I guess it
    leaves the rake task a lot cleaner really - fine with it, just wondering
    about the rationale best practice. Makes the code useable throughout the
    app?

Re-usability and testability are the biggest factors in keeping this
code in the corresponding class.

  1. A mighty thanks for the FasterCSV.foreach(filename, :row_sep =>
    “\r\n”) code, works perfectly, as you say I was having to “pre treat” my
    CSV files beforehand, an extra and unnecessary step, this works a treat.
    This code is now moved to the resort model in app/model/resort.rb, works
    fine!

  2. Interpolation for resorts within the strong, rather than put + ’ ’ +
    ’ ’ + etc, is more elegant I imagine. Done.

  3. Using the bang version of create, very useful to throw exceptions as
    you say.

  4. If you get a chance to comment on the migration problem that’d be
    useful - maybe I shouldn’t worry about it. I was thinking to have the
    migration be coded in pseudo-code such that.

  • read the first row of the csv
    (I’ll put data types in here, e.g. string, text, boolean)

  • read the second row of the csv for the field names
    (e.g. name, serial_number, country)

  • generate code such that the migration is created almost dynamically
    like this t.[data type goes here] :[field name goes here]

However - maybe I’m overcomplicating matters and it’s better just to do
the migration by hand just including the required fields.

I would keep the migration just that, very simple. No need to reinvent
the wheel by adding extra pieces of data to your CSV.

  1. I moved the database report to lib/database/report.rb, kind of worked
    but I’m a little confused, what happened when I called it from the rake
    task was that I got the output but on the end I had a “Class #223jjk” <-
    somehting like that, listed on the end of the output. I got rid of it by
    commenting out the def self.to_s and end statement, but I’m sure this is
    incorrect. I was calling it from the rake task with simply puts
    Database::Report.

Again, the whole Database::Report thing was just a poor example. Just
stick that code in whichever place you like.

Best,
Michael G.