Forum: Ruby on Rails initialization of new model objects in Rails

Posted by Dan Brooking (Guest)
on 2012-12-19 22:48
(Received via mailing list)
Hey guys... having an issue that hopefully someone can help with.

I have an ActiveRecord model called WebPage; it has two fields - url and
title.  I want the title to be determined by parsing the HTML (I'm using
Nokogiri) - and this is where I'm having issues.

My code looks something like:

class WebPage < ActiveRecord::Base
  attr_accessible :url, :title, :doc

  def doc
    @doc ||= Nokogiri::HTML(open(@url))
  end

  def title
    title = @doc.css('title')
  end
end


What is happening is that if I run this and try to create a new object
(@page = WebPage.new(@url)), I get:
ActionView::Template::Error (undefined method `css' for nil:NilClass)

Now, if I set @doc in my controller, change the name of title to
set_title, and call @page.title = @page.set_title, it works.  But that is
very ugly, and if I've learned anything from Rails, it's that if it looks
ugly to start with, it's probably not the right way.

What am I doing wrong?
Posted by Walter Davis (walterdavis)
on 2012-12-19 23:39
(Received via mailing list)
On Dec 19, 2012, at 4:46 PM, Dan Brooking wrote:

>     @doc ||= Nokogiri::HTML(open(@url))
>
> Now, if I set @doc in my controller, and change the name of title to set_title
> and call @page.title = @page.set_title, it works.  But that is very ugly and if
> I've learned anything from rails, is that if it looks ugly to start with, it's
> probably not the right way.
>
> What am I doing wrong?
>

You're not calling your doc() method when you use the @doc instance 
variable, so you aren't initializing @doc. I believe you could fix this 
by removing the @ before doc in your title method. You're still going to 
get the benefit of "memoizing" the value (you'll only look it up once, 
no matter how many times you ask for it).
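
To make the difference concrete, here is a minimal plain-Ruby sketch. It needs no Rails or Nokogiri. The Page class, its method names, and the upcase call standing in for the real parse are all hypothetical, chosen only to show that reading the instance variable directly never triggers the memoized method, while calling the method does.

```ruby
# Hypothetical stand-in for WebPage; illustration only.
class Page
  attr_reader :parse_count

  def initialize(html)
    @html = html
    @parse_count = 0
  end

  # Memoized reader: the expensive work runs on the first call only.
  def doc
    @doc ||= begin
      @parse_count += 1
      @html.upcase # stand-in for Nokogiri::HTML(...)
    end
  end

  # Broken: reads the instance variable directly, so doc() never runs.
  def title_via_ivar
    @doc
  end

  # Fixed: goes through the method, which sets @doc on demand.
  def title_via_method
    doc
  end
end

page = Page.new("<title>hi</title>")
page.title_via_ivar    # nil, because nothing has called doc() yet
page.title_via_method  # "<TITLE>HI</TITLE>"
page.title_via_method  # same value, served from @doc
page.parse_count       # 1
```

An unset instance variable evaluates to nil in Ruby rather than raising, which is why the original error shows up one step later as "undefined method `css' for nil:NilClass".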

One other thing to consider here. doc.css will always return an 
array-like NodeSet rather than a Node. You can do one of two things: 
doc.css('title').first() or doc.at_css('title'), which will do the same 
thing. And then if you want the content of the title, you need to say 
so. Otherwise, you will get a Node, and if it gives you the string 
contents of itself as a return value, that's pure coincidence.

title = doc.at_css('title').content

Walter
Posted by Dan Brooking (Guest)
on 2012-12-20 02:35
(Received via mailing list)
Wow... you have no idea how long I've been staring at this.  That's
exactly what it was.. changed "@doc" to "doc" and that did it.

I do have some code for parsing out the NodeSet.... just removed it for
simplicity in my code.  However, my code isn't as clean as what you have
so I'll play around with it some.

Thanks a bunch!