Forum: Ruby on Rails Mechanize out of buffer space

wbsmith83@gmail.com (Guest)
on 2007-01-26 15:16
(Received via mailing list)
I am trying to scrape a site and then its child pages to collect data
that I store in related tables. The only problem is that I keep getting
an "OUT OF BUFFER SPACE" error. Is there a way to clear the buffer after
each iteration, or am I doing something wrong?

Here's the code:
require 'rubygems'
require 'mechanize'
require 'active_record'

ActiveRecord::Base.establish_connection(
#connection goes here
)

class Major < ActiveRecord::Base
  has_many :courses
end

class Course < ActiveRecord::Base
  belongs_to :major
end

class Sections
  def scrape(url)
    agent = WWW::Mechanize.new
    page = agent.get(url)
    table = (page/'//table')[6]
    (table/"tr").each do |major|
      @newMajor = Major.new
      @newMajor.title = (major/'//td').first.inner_html
      @newMajor.abbrev = (major/'acronym').inner_html
      @newMajor.link_to = (major/'a').to_s.split('"')[1]
      # Print the fields from the record just built
      puts @newMajor.title, @newMajor.abbrev, @newMajor.link_to
    end
  end
end

class Classes
  attr_writer :major_id
  def scrape(url)
    agent = WWW::Mechanize.new
    page = agent.get("http://courses.tamu.edu/"+url.to_s)
    (page/"//td[@class='sectionheading']").each do |course|
      course = course.inner_html.strip.split(' ')
      course.pop
      @newCourse = Course.new
      @newCourse.major_id = @major_id
      @newCourse.course_no = course[1]
      @newCourse.name = course.slice!(3,course.length).join(' ')
      @newCourse.save
    end
  end
end

# Scrape the course list for every major already stored in the database
all_majors = Major.find(:all)
all_majors.each do |major|
  start = Time.now
  newClass = Classes.new
  newClass.major_id = major.id
  newClass.scrape(major.link_to)
  puts "Added courses for #{major.title}"
  finish = Time.now
  puts "Took #{finish-start} seconds"
end
puts "Finished scraping courses"
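
For reference, the heading handling in Classes#scrape can be tried on its own, without Mechanize or a database. This is only a sketch: parse_heading is a hypothetical helper name, and the sample heading text is made up, so the real markup on the site may be laid out differently.

```ruby
# Mirrors the steps in Classes#scrape: strip, split on spaces, drop the
# last token, take token 1 as the course number and tokens 3.. as the name.
def parse_heading(text)
  tokens = text.strip.split(' ')
  tokens.pop
  {
    course_no: tokens[1],
    # Guard against short headings, where tokens[3..-1] is nil
    name: (tokens[3..-1] || []).join(' ')
  }
end

# Made-up sample heading, not copied from the real site
parsed = parse_heading("  ACCT 209 - SURVEY OF ACCT PRIN 3  ")
puts parsed[:course_no]
puts parsed[:name]
```

Unlike the slice! call in the script, this version does not modify the token array and does not fail when a heading has fewer than four tokens.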
wbsmith83@gmail.com (Guest)
on 2007-01-26 18:10
(Received via mailing list)
After delving into the actual Hpricot source, it turns out the parser
has a predefined buffer size, and you can't change it without editing
the source and recompiling.
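
If recompiling is not an option, one possible workaround is to let the loop survive a page that overflows the buffer instead of aborting the whole run. This is a minimal sketch that was not tried in this thread: BufferOverflow and scrape_all are hypothetical names, and BufferOverflow only stands in for whatever exception the parser actually raises.

```ruby
# Hypothetical stand-in for the exception the parser raises when a page
# overflows its fixed buffer; swap in the real exception class.
class BufferOverflow < StandardError; end

# Yields each URL to the block. A URL whose block raises BufferOverflow is
# recorded with its error message and skipped; the other URLs still run.
def scrape_all(urls)
  failed = []
  urls.each do |url|
    begin
      yield url
    rescue BufferOverflow => e
      failed << [url, e.message]
    end
  end
  failed
end

# Usage with a fake scraper: the second (made-up) URL simulates the overflow.
done = []
failed = scrape_all(%w[a.html big.html c.html]) do |url|
  raise BufferOverflow, "out of buffer space" if url == "big.html"
  done << url
end
puts "scraped: #{done.join(', ')}"
failed.each { |url, msg| puts "skipped #{url}: #{msg}" }
```

This does not fix the buffer limit itself. It only keeps one oversized page from stopping the rest of the scrape, and the skipped URLs are returned so they can be handled separately.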