Need help with the Nokogiri XML parser

Hi,

I am facing an issue with the Nokogiri XML parser:

I am using the following code to parse:

doc = Nokogiri::HTML.fragment(xml)
puts doc.to_xml

Output:

-18
0
960
720
0px

%% url-97255 %% 000000 https://example.com/some.jpg Opening Still url-97255 _self

Ideally, the link tag should contain “%% url-97255 %%”.

And when I use:
doc = Nokogiri::XML(xml)
puts doc.to_xml

In that case, the HTML entities are not parsed correctly:

331
183
508
44

false
000000
false
P ALIGN=LEFTFONT FACE=Arial SIZE=24 COLOR=#CC0033
LETTERSPACING=0 KERNING=0Thi - Creativs ise/FONT/P
0px
000000

Instead, I was hoping to get output something like:
<P ALIGN="LEFT"><FONT FACE="Arial" SIZE="24" COLOR="#CC0033"
LETTERSPACING="0" KERNING="0">Membership Rewards</FONT></P>

Any help would be appreciated.

Thank you,
Abhishek S.

On Wednesday, February 25, 2015 at 9:40:39 AM UTC, [email protected] wrote:

> Hi,
>
> I am facing issue with the Nokogiri XML Parser:
>
> I am using the following code to parse:
>
> doc = Nokogiri::HTML.fragment(xml)
> puts doc.to_xml

What is the input you’re feeding it? If the input is malformed, then
Nokogiri will have to guess at how to fix it.

Fred