Nokogiri not returning attribute value verbatim

I can't seem to find an answer to this on Google:

If I have this value as the text of an attribute in my XML source:
"a2/PP00nFwWa7I8Jog7bcw==\n"

When I ask Nokogiri to return it, why does it return this:
"a2/PP00nFwWa7I8Jog7bcw== " (I confirmed in the debugger that the last
character is a space). So it seems Nokogiri is converting the "\n" to a
space.

Is there a way to tell Nokogiri to return the value verbatim? I am dealing
with encrypted data, so this modification to the XML source is significant.

I originally thought this might be a Ruby 1.9.2 issue, but I confirmed that
the behavior is the same in 1.8.7. The difference is that REXML was
returning this string as expected, and I am now converting to Nokogiri.

Thanks,

David

David K. wrote in post #958607:

Is there a way to tell Nokogiri to return the value verbatim? I am dealing
with encrypted data, so this modification to the XML source is significant.

You probably need to use the xml:space attribute in your source
document, or at least that's the impression I get from the
"White Space" page on Microsoft Learn.

Best,

Marnen Laibow-Koser
http://www.marnen.org

On Tue, Nov 2, 2010 at 3:22 PM, Marnen Laibow-Koser wrote:

You probably need to use the xml:space attribute in your source
document, or at least that's the impression I get from the
"White Space" page on Microsoft Learn.

Thanks Marnen, that was a really good idea. I just tried it in the console
and it does not seem to help for the "\n" (results below). Is it possible
that some other setting would preserve the "\n"? This is really strange to
me, as these characters are within a string literal, though the handling of
the spaces surprises me as well.

without the xml:space="preserve"

ruby > doc_enc = "<BORROWER _SSN=\"a2/PP00nFwWa7I8Jog7bcw==\n\">"
=> "<BORROWER _SSN=\"a2/PP00nFwWa7I8Jog7bcw==\n\">"
ruby > Nokogiri::XML(doc_enc)
=> #<Nokogiri::XML::Document:0x12ff91c name="document"
children=[#<Nokogiri::XML::Element:0x12ff71e name="BORROWER"
attributes=[#<Nokogiri::XML::Attr:0x12ff6d8 name="_SSN"
value="a2/PP00nFwWa7I8Jog7bcw== ">]>]>
ruby > nd = Nokogiri::XML(doc_enc)
=> #<Nokogiri::XML::Document:0x12fded2 name="document"
children=[#<Nokogiri::XML::Element:0x12fdcac name="BORROWER"
attributes=[#<Nokogiri::XML::Attr:0x12fdc3e name="_SSN"
value="a2/PP00nFwWa7I8Jog7bcw== ">]>]>
ruby > nd.xpath("BORROWER").attribute("_SSN").value
=> "a2/PP00nFwWa7I8Jog7bcw== "

with xml:space="preserve"

ruby > doc_enc = "<BORROWER xml:space=\"preserve\" _SSN=\"a2/PP00nFwWa7I8Jog7bcw==\n\">"
=> "<BORROWER xml:space=\"preserve\" _SSN=\"a2/PP00nFwWa7I8Jog7bcw==\n\">"
ruby > Nokogiri::XML(doc_enc)
=> #<Nokogiri::XML::Document:0x12f7244 name="document"
children=[#<Nokogiri::XML::Element:0x12f708c name="BORROWER"
attributes=[#<Nokogiri::XML::Attr:0x12f705a name="space"
namespace=#<Nokogiri::XML::Namespace:0x12f6f56 prefix="xml"
href="http://www.w3.org/XML/1998/namespace"> value="preserve">,
#<Nokogiri::XML::Attr:0x12f7050 name="_SSN"
value="a2/PP00nFwWa7I8Jog7bcw== ">]>]>
ruby > nd.xpath("BORROWER").attribute("_SSN").value
=> "a2/PP00nFwWa7I8Jog7bcw== "
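
A note on why xml:space makes no difference in the session above: it
concerns whitespace in element content, whereas attribute values go through
the attribute-value normalization step of the XML 1.0 specification, which
turns each literal tab, carriage return, and line feed into a space. The
following is a minimal pure-Ruby sketch of that rule, for illustration only.
normalize_attribute_value is a hypothetical helper, not part of Nokogiri or
libxml2, and it ignores entity expansion and the extra trimming applied to
non-CDATA attribute types.

```ruby
# Hypothetical helper: mimics XML 1.0 attribute-value normalization for
# literal whitespace in a CDATA-type attribute (no entity handling).
def normalize_attribute_value(raw)
  raw.gsub(/\r\n?/, "\n")   # line-end normalization happens first
     .tr("\t\n", "  ")      # each tab or line feed becomes one space
end

with_newline = "a2/PP00nFwWa7I8Jog7bcw==\n"   # double quotes: real line feed
puts normalize_attribute_value(with_newline).inspect
# => "a2/PP00nFwWa7I8Jog7bcw== "
```

Under that rule the trailing space is what a conforming parser is expected
to report whenever the attribute really contains a line feed.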

On Tue, Nov 2, 2010 at 3:50 PM, David K. wrote:

What seems even more insane is that if I wrap the encrypted string in
delimiter characters (pipes in this case), it still takes away my "\n":

=> "<BORROWER xml:space=\"preserve\" _SSN=\"|a2/PP00nFwWa7I8Jog7bcw==\n|\">"
ruby-1.9.2-p0 > Nokogiri::XML(doc_enc)
=> #<Nokogiri::XML::Document:0x12eb1d8 name="document"
children=[#<Nokogiri::XML::Element:0x12eafd0 name="BORROWER"
attributes=[#<Nokogiri::XML::Attr:0x12eaf94 name="space"
namespace=#<Nokogiri::XML::Namespace:0x12eaea4 prefix="xml"
href="http://www.w3.org/XML/1998/namespace"> value="preserve">,
#<Nokogiri::XML::Attr:0x12eaf8a name="_SSN"
value="|a2/PP00nFwWa7I8Jog7bcw== |">]>]>

On Tue, Nov 2, 2010 at 3:59 PM, David K. wrote:

Sorry for all the additional posts, but this also happens in CDATA!!! Can
the characters "\n" never mean anything but newline in our world?

=> "<BORROWER _SSN=\"[CDATA[a2/PP00nFwWa7I8Jog7bcw==\n]]\">"
ruby-1.9.2-p0 > Nokogiri::XML(doc_enc)
=> #<Nokogiri::XML::Document:0x12e6f16 name="document"
children=[#<Nokogiri::XML::Element:0x12e6d04 name="BORROWER"
attributes=[#<Nokogiri::XML::Attr:0x12e6cd2 name="_SSN"
value="[CDATA[a2/PP00nFwWa7I8Jog7bcw== ]]">]>]>
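
For context on the CDATA attempt: CDATA sections are only recognized in
element content, so inside an attribute value the text is taken literally
and a line feed is still normalized. If a real line feed has to survive in
an attribute, the route the XML specification offers is a numeric character
reference such as &#10;, which a conforming parser is expected to turn back
into the character without normalizing it. Below is a minimal sketch of
writing such an attribute. escape_attribute_whitespace is a hypothetical
helper, not a Nokogiri API, and it handles whitespace only (it does not
escape &, < or quotes).

```ruby
# Hypothetical helper: replace whitespace that a parser would normalize
# with numeric character references, which are kept as-is.
WHITESPACE_REFS = { "\t" => "&#9;", "\n" => "&#10;", "\r" => "&#13;" }.freeze

def escape_attribute_whitespace(value)
  value.gsub(/[\t\n\r]/, WHITESPACE_REFS)
end

ssn = "a2/PP00nFwWa7I8Jog7bcw==\n"   # ends in a real line feed
xml = %(<BORROWER _SSN="#{escape_attribute_whitespace(ssn)}"/>)
puts xml
# prints: <BORROWER _SSN="a2/PP00nFwWa7I8Jog7bcw==&#10;"/>
```

This only helps when you control how the XML is written; it does not change
how an existing document is parsed.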

On Nov 2, 6:03 pm, David K. wrote:

Sorry for all the additional posts, but this also happens in CDATA!!! Can
the characters "\n" never mean anything but newline in our world?

Take a deep breath. I believe you are quoting your strings
incorrectly. Witness the following script, which behaves as expected
on Nokogiri 1.4.3.1 and libxml 2.7.6:

require 'rubygems'
require 'nokogiri'

xml = '<root><foo _SSN="a2/PP00nFwWa7I8Jog7bcw==\n">bar</foo></root>'

puts Nokogiri::XML.parse(xml).to_xml
# => <?xml version="1.0"?>
#    <root>
#      <foo _SSN="a2/PP00nFwWa7I8Jog7bcw==\n">bar</foo>
#    </root>

Next time you may want to try the nokogiri-talk mailing list for a
quicker response from users of the library.
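
The quoting difference can be checked without Nokogiri at all. In this
small sketch (the variable names are only illustrative), the double-quoted
literal contains one line-feed character, while the single-quoted one
contains a backslash followed by the letter n:

```ruby
double = "==\n"   # escape sequence is interpreted: 3 characters
single = '==\n'   # no interpretation: 4 characters

puts double.length          # 3
puts single.length          # 4
puts double.bytes.inspect   # [61, 61, 10]
puts single.bytes.inspect   # [61, 61, 92, 110]
puts single.inspect         # "==\\n"
```

Only the double-quoted form hands a real line feed to the parser.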

On Wed, Nov 3, 2010 at 7:27 AM, Mike D. wrote:

Next time you may want to try the nokogiri-talk mailing list for a
quicker response from users of the library.

Thanks Mike, this does work when I try it in the console, and you are right
that it has to do with quoting. What seems clear is that the entire XML has
to be within single quotes, because within double quotes the \n gets
replaced. What I am not clear about is how to tell Ruby, when I load the
file (I am getting the XML out of a saved file), to put it in single rather
than double quotes. Or is there a way to transform it after loading it? When
I read the file in I get:

file = "<BORROWER _SSN=\"a2/PP00nFwWa7I8Jog7bcw==\n\">"
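
On the file question: quote style only matters for literals written in Ruby
source. A string read with File.read holds exactly the characters in the
file, and irb merely displays it in double-quoted form via inspect. So a
literal backslash-n in the file shows up as \\n in irb, while a plain \n in
the display means the file contains a real line feed. Here is a minimal
sketch using a temporary file; the file name and contents are made up for
illustration and are not taken from the actual data.

```ruby
require 'tempfile'

# A backslash followed by "n" (two characters) inside the attribute.
contents = '<BORROWER _SSN="a2/PP00nFwWa7I8Jog7bcw==\n"/>'

# Write the text out and read it back, as loading a saved file would.
loaded = Tempfile.create('borrower') do |f|
  f.write(contents)
  f.flush
  File.read(f.path)
end

puts loaded == contents      # true: nothing was reinterpreted
puts loaded.include?("\n")   # false: no real line feed in the file
puts loaded.include?('\n')   # true: backslash + n are still there
puts loaded.inspect          # inspect shows the backslash doubled, as irb does
```

If the display shows a single \n, the file most likely holds a real line
feed inside the attribute, and its conversion to a space is then standard
XML parsing behavior rather than something to undo by re-quoting.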