Hey guys... I'm having an issue that hopefully someone can help with. I have an ActiveRecord model called WebPage with two fields, url and title. I want the title to be determined by parsing the HTML (I'm using Nokogiri), and this is where I'm having issues. My code looks something like:

    class WebPage < ActiveRecord::Base
      attr_accessible :url, :title, :doc

      def doc
        @doc ||= Nokogiri::HTML(open(@url))
      end

      def title
        title = @doc.css('title')
      end
    end

What happens is that if I run this and try to create a new object (@page = WebPage.new(@url)), I get:

    ActionView::Template::Error (undefined method `css' for nil:NilClass)

Now, if I set @doc in my controller, change the name of title to set_title, and call @page.title = @page.set_title, it works. But that is very ugly, and if I've learned anything from Rails, it's that if it looks ugly to start with, it's probably not the right way.

What am I doing wrong?
on 2012-12-19 22:48
on 2012-12-19 23:39
On Dec 19, 2012, at 4:46 PM, Dan Brooking wrote:

> @doc ||= Nokogiri::HTML(open(@url))
>
> Now, if I set @doc in my controller, and change the name of title to set_title and call @page.title = @page.set_title, it works. But that is very ugly and if I've learned anything from rails, is that if it looks ugly to start with, it's probably not the right way.
>
> What am I doing wrong?

You're not calling your doc() method when you use the @doc instance variable, so you aren't initializing @doc. I believe you could fix this by removing the @ before doc in your title method. You're still going to get the benefit of "memoizing" the value (you'll only look it up once, no matter how many times you ask for it).

One other thing to consider here: doc.css will always return an array-like NodeSet rather than a Node. You can do one of two things: doc.css('title').first() or doc.at_css('title'), which will do the same thing. And then if you want the content of the title, you need to say so. Otherwise, you will get a Node, and if it gives you the string contents of itself as a return value, that's pure coincidence.

    title = doc.at_css('title').content

Walter
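
To make the difference between @doc and doc concrete, here is a minimal plain-Ruby sketch of the memoization point. It is not the actual model: PageSketch, load_document, load_count, title_via_ivar and title_via_reader are hypothetical names made up for illustration, and load_document just returns a hash as a stand-in for Nokogiri::HTML(open(url)) so the snippet runs without Rails, Nokogiri, or a network connection.

```ruby
# Plain-Ruby sketch. PageSketch and load_document are hypothetical stand-ins
# for the WebPage model and for Nokogiri::HTML(open(url)).
class PageSketch
  attr_reader :load_count

  def initialize(url)
    @url = url
    @load_count = 0
  end

  # Memoized reader: the expensive load runs only on the first call.
  def doc
    @doc ||= load_document
  end

  # Broken variant: reads the instance variable directly. @doc is still nil
  # unless something has already called doc, so this raises NoMethodError.
  def title_via_ivar
    @doc[:title]
  end

  # Fixed variant: goes through the reader, so @doc is initialized on demand.
  def title_via_reader
    doc[:title]
  end

  private

  # Stand-in for fetching and parsing the page; counts how often it runs.
  def load_document
    @load_count += 1
    { title: "Example page for #{@url}" }
  end
end
```

Calling title_via_ivar on a fresh object fails in the same way as the original error (a method called on nil), while title_via_reader works, and load_count stays at 1 however many times the reader is used afterwards.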
on 2012-12-20 02:35
Wow... you have no idea how long I've been staring at this. That's exactly what it was.. changed "@doc" to "doc" and that did it. I do have some code for parsing out the NodeSet.... just removed it for simplicity in my code. However, my code isn't as clean as what you have so I'll play around with it some. Thanks a bunch!