Forum: Ruby Problems with scRUBYt

Cs Webgrl (cswebgrl)
on 2008-12-31 00:33
Hi.

I am currently scraping a page with scRUBYt and am not getting the
results I expected.

Instead of a correctly formatted XML document, I'm getting the
following:

<record>
  <1>book a</1>
  <1>book b</1>
  <1>book c</1>
  <2>chapter aa</2>
  <2>chapter bb</2>
  <2>chapter cc</2>
  <3>verse aaa</3>
  <3>verse bbb</3>
  <3>verse ccc</3>
</record>

My code looks like this:

listing "//a[@id*='volume']" do
 book "//a[@class='1']"
 chapter "//span[@class='2']"
 verse "//a[@id*='3']"
end

Any ideas?

Sorry for the sample data, but hopefully someone has seen this before
and can help.
Aaron Patterson (Guest)
on 2008-12-31 01:38
(Received via mailing list)
On Wed, Dec 31, 2008 at 08:33:26AM +0900, Cs Webgrl wrote:
>   <1>book b</1>
>   <1>book c</1>
>   <2>chapter aa</2>
>   <2>chapter bb</2>
>   <2>chapter cc</2>
>   <3>verse aaa</3>
>   <3>verse bbb</3>
>   <3>verse ccc</3>
> </record>

This is a correctly formatted XML document.  You just have numbers for
tag names.

> My code looks like this:
>
> listing "//a[@id*='volume']" do
>  book "//a[@class='1']"
>  chapter "//span[@class='2']"
>  verse "//a[@id*='3']"
> end
>
> Any ideas?

Have you tried something like this:

  book "//2[@id='whatevs']"

That should get you access to the tags.

Hope that helps!
Cs Webgrl (cswebgrl)
on 2008-12-31 02:02
Aaron Patterson wrote:

>
> Have you tried something like this:
>
>   book "//2[@id='whatevs']"
>
> That should get you access to the tags.


This gives me a ton of data, but now I have lost the specific pieces of
information that I'm looking for.  Instead, the output looks like the
source of the entire page.  Should I have changed something else in
the code to get the specific pieces of data that I need?
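
One thing that may be worth checking, as a guess rather than a confirmed diagnosis: in plain XPath, a path that starts with `//` searches the whole document no matter which node it is evaluated from, while a path that starts with `.//` searches only inside the current node. If the child patterns are matched against the whole page, the result is three flat lists like the output in the first post instead of one record per volume. The sketch below shows the relative form using REXML from Ruby's standard distribution, not scRUBYt, so it says nothing about how scRUBYt resolves child patterns. The page fragment, the `div` containers, and the `PAGE` and `records` names are all hypothetical, since the real page was never posted.

```ruby
require "rexml/document"

# Hypothetical page fragment. The real page was not posted, so the element
# names and attributes here are assumptions made for illustration only.
PAGE = <<~HTML
  <body>
    <div id="volume-a"><a class="1">book a</a><span class="2">chapter aa</span><a class="3">verse aaa</a></div>
    <div id="volume-b"><a class="1">book b</a><span class="2">chapter bb</span><a class="3">verse bbb</a></div>
    <div id="volume-c"><a class="1">book c</a><span class="2">chapter cc</span><a class="3">verse ccc</a></div>
  </body>
HTML

doc = REXML::Document.new(PAGE)

# Select each volume container once, then query inside it with paths that
# start with "." so they are evaluated relative to that container.
records = REXML::XPath.match(doc, "//div[contains(@id, 'volume')]").map do |volume|
  {
    book:    REXML::XPath.first(volume, ".//a[@class='1']").text,
    chapter: REXML::XPath.first(volume, ".//span[@class='2']").text,
    verse:   REXML::XPath.first(volume, ".//a[@class='3']").text
  }
end

# Print one line per volume, with that volume's book, chapter, and verse.
records.each { |record| puts record.values.join(" / ") }
```

Each hash in `records` keeps one volume's book, chapter, and verse together, which is the grouping the flat output loses. Note also that `@id*='volume'` is not standard XPath; `contains(@id, 'volume')` is the standard way to write a substring match.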