Parsing XML file with no style info with Hpricot

Hello,

I’ve been trying for hours to parse an XML file using Hpricot. Usually it’s
not a problem. Here’s my simple code:

# This works and outputs the proper XML data
@url1 = 'http://www.sportingnews.com/stories/sportingnews/MLB/rss.xml'
@page1 = Hpricot(open(@url1))
<%= @page1 %>

# This does not work, and I'm scratching my head
@url1 = 'http://gd2.mlb.com/components/game/mlb/year_2010/month_03/day_06/gid_2010_03_06_anamlb_oakmlb_1/boxscore.xml'
@page1 = Hpricot(open(@url1))
<%= @page1 %>

The gd2.mlb.com XML file does not have any style information according
to Firefox. I can read it using Oxygen. Can somebody provide me with a
hint on how to parse the mlb.com XML? Thanks!

-A

Any idea how to parse this XML?

-A

On Sun, Mar 7, 2010 at 4:10 AM, Allan L. [email protected] wrote:

#This does not work, and I’m scratching my head

And I’m scratching mine trying to guess what you mean by “does not
work” …


Hassan S. ------------------------ [email protected]
twitter: @hassan

Hpricot is not parsing the MLB XML file. I suspect the reason is that the
file is not in a standard XML format.

If you give my code a quick try, you’ll notice that it will read other
XML files, but not the MLB XML.

# This works and outputs the proper XML data
@url1 = 'http://www.sportingnews.com/stories/sportingnews/MLB/rss.xml'
@page1 = Hpricot(open(@url1))
<%= @page1 %>

# This does not work, and I'm scratching my head
@url1 = 'http://gd2.mlb.com/components/game/mlb/year_2010/month_03/day_06/gid_2010_03_06_anamlb_oakmlb_1/boxscore.xml'
@page1 = Hpricot(open(@url1))
<%= @page1 %>

On Sun, Mar 7, 2010 at 9:39 AM, Allan L. [email protected] wrote:

If you give my code a quick try, you’ll notice that it will read other
XML files, but not the MLB XML.

Actually, I already did, and it seems to work just fine. Hence my own
head-scratching. :)

So, again, maybe you can say exactly what you expect to happen
and how that differs from what you’re seeing.


Hassan S. ------------------------ [email protected]
twitter: @hassan

On Mar 7, 6:18 pm, Allan L. [email protected] wrote:

I’m expecting the XML information seen here on Firefox: http://picasaweb.google.com/lh/photo/X7VFocR3L4S4Pl_2jvDzVQ?feat=dire

to be displayed when I parse the MLB file. Hpricot is not parsing this
file.

Have you tried viewing the source of the page generated by your view?
I suspect hpricot is parsing the file but just blatting it into the
view like that is producing invalid html which your browser is not
rendering.

Fred
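
To illustrate Fred's point, here is a minimal sketch using only Ruby's standard erb library. It assumes the view interpolates the parsed document as a raw string with no HTML escaping. The xml string and its attribute names are made-up stand-ins for whatever @page1.to_s returns, not the real MLB feed.

```ruby
require 'erb'

# Hypothetical stand-in for the string that @page1.to_s would return.
# The data lives in attributes, so there is no text between the tags.
xml = '<boxscore game_id="example"><team team_code="aaa" /></boxscore>'

# Plain ERB inserts the string as-is, so the browser receives unknown
# tags with no text content and has nothing visible to draw.
raw_html = ERB.new('<div><%= xml %></div>').result(binding)

# Escaping first turns the markup into literal text the browser can show.
escaped_html = ERB.new(
  '<div><pre><%= ERB::Util.html_escape(xml) %></pre></div>'
).result(binding)

puts raw_html
puts escaped_html
```

Viewing the page source, as Fred suggests, would show the first form: the XML is present in the response even though the rendered page looks blank.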

Hi Hassan,

This picture:
http://picasaweb.google.com/lh/photo/Qf4DFta9p5ERoCRb6Lbd2Q?feat=directlink

This is the parsed output from the feed from the sportingnews XML file.
It is displayed on my view with <%= @page1 %>.

This picture:
http://picasaweb.google.com/lh/photo/xLVr8_U-x12rJnADs_qcEw?feat=directlink

The blank space is what is displayed on the view with <%= @page1 %> using
the MLB XML file.

I’m expecting the XML information seen here on Firefox:
http://picasaweb.google.com/lh/photo/X7VFocR3L4S4Pl_2jvDzVQ?feat=directlink

to be displayed when I parse the MLB file. Hpricot is not parsing this
file.

-A

Thanks everybody. I saw the info on the source. I figured it out.

-A

On Sun, Mar 7, 2010 at 10:18 AM, Allan L. [email protected]
wrote:

I’m expecting the XML information seen here on Firefox:
http://picasaweb.google.com/lh/photo/X7VFocR3L4S4Pl_2jvDzVQ?feat=directlink
to be displayed when I parse the MLB file. Hpricot is not parsing this
file.

Sure it is – use irb to examine what’s in @page1.

As Frederick already suggested, you apparently have a view problem,
not an Hpricot parsing problem.


Hassan S. ------------------------ [email protected]
twitter: @hassan
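
Hassan's suggestion of checking the parsed document outside the view can be sketched roughly as follows. Hpricot may not be installed everywhere, so this sketch swaps in REXML, a parser distributed with Ruby, as a stand-in. The sample XML and its attribute names are hypothetical, not copied from the MLB feed.

```ruby
require 'rexml/document'

# Hypothetical sample; the real boxscore.xml has different content.
xml = '<boxscore game_id="example"><team team_code="aaa" /></boxscore>'

# Parse the string and look at the result directly, with no view involved.
doc  = REXML::Document.new(xml)
root = doc.root

root_name  = root.name
game_id    = root.attributes['game_id']
team_codes = root.get_elements('team').map { |t| t.attributes['team_code'] }

puts root_name
puts game_id
puts team_codes.inspect
```

If values like these come back in irb, the parser is doing its job and the problem lies in how the view renders the result.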