How can I search value from xml

Artit_S · February 20, 2006, 1:10pm

How can I search for a value in an XML file? For example, I want to
search by pubdate and return the matching biblioentry.
Please give me some source code for further study.

<?xml version="1.0" encoding="ISO-8859-15"?> Godfrey Vesey Personal Identity: A Philosophical Analysis Cornell University Press 1977 Geoffrey Madell The Identity of the Self Edinburgh University Press 1981 Sydney Shoemaker Richard Swinburne Personal Identity Basil Blackwell 1984 Jonathan Glover The Philosophy and Psychology of Personal Identity Penguin 1988 Harold W. Noonan Personal Identity Routledge 1989 Ren Marres Persoonlijke identiteit na het verval van de ziel Coutinho 1991 James Baillie Problems in Personal Identity Paragon House 1993 Brian Garrett Personal Identity and Self-Consciousness Routledge 1998 John Perry Identity, Personal Identity, and the Self Hackett 2002

Thank you

Artit S.

http://www.rubybox.net (Thai Language)

Artit_S · February 20, 2006, 1:19pm

Artit S. wrote:

How can I search for a value in an XML file? For example, I want to
search by pubdate and return the matching biblioentry.

http://www.germane-software.com/software/rexml/

robert

Artit_S · February 20, 2006, 3:14pm

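You'll probably want to use REXML and XPath:

require 'rexml/document'
require 'rexml/xpath'

include REXML

# Document.new expects XML source (a String or an IO), not a file name,
# so open the file named on the command line first.
bibliography = Document.new( File.new( ARGV[0] ) )

# "//" matches biblioentry at any depth, not only at the document root.
XPath.each( bibliography, "//biblioentry[pubdate > 1993]" ) do |biblioentry|
  # do something with biblioentry here
end

Not entirely sure if that works, as the PC I'm on doesn't have Ruby
installed.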

Scott
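To make the "do something with biblioentry here" step concrete, here is a minimal sketch of the same REXML/XPath idea run against an inline string. The `<bibliography>` root element, the `example-*` ids, and the helper name `entries_after` are assumptions made for illustration; only the element names inside `biblioentry` come from the thread.

```ruby
require 'rexml/document'

# Hypothetical sample data: two entries from the question, wrapped in an
# assumed <bibliography> root element with made-up id attributes.
SAMPLE_XML = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <bibliography>
    <biblioentry id="example-1">
      <author><firstname>Godfrey</firstname><surname>Vesey</surname></author>
      <title>Personal Identity: A Philosophical Analysis</title>
      <publisher><publishername>Cornell University Press</publishername></publisher>
      <pubdate>1977</pubdate>
    </biblioentry>
    <biblioentry id="example-2">
      <author><firstname>John</firstname><surname>Perry</surname></author>
      <title>Identity, Personal Identity, and the Self</title>
      <publisher><publishername>Hackett</publishername></publisher>
      <pubdate>2002</pubdate>
    </biblioentry>
  </bibliography>
XML

# Return every <biblioentry> element whose <pubdate> is later than +year+.
# (entries_after is a hypothetical helper, not part of REXML.)
def entries_after(xml, year)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//biblioentry[pubdate > #{Integer(year)}]")
end

# Each match is a REXML::Element, so child elements and attributes
# can be read directly from it.
entries_after(SAMPLE_XML, 1993).each do |entry|
  id    = entry.attributes['id']
  title = entry.elements['title'].text
  year  = entry.elements['pubdate'].text
  puts "#{id}: #{title} (#{year})"
end
```

To read from a file instead, pass `File.read(path)` (or an open `File`) in place of `SAMPLE_XML`.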

Artit_S · February 20, 2006, 5:19pm

Artit S. wrote:

    <firstname>Godfrey</firstname>
    <firstname>Geoffrey</firstname>
    <firstname>Sydney</firstname>
  <pubdate>1984</pubdate>
  <pubdate>1988</pubdate>
</biblioentry>

class String
  def xtag(s)
    scan( %r! ( < #{s} [^>]* > ) ( .*? ) </ #{s} > !mx )
  end
end

gets(nil).xtag("biblioentry").each { |tag,data|
  if data.xtag("pubdate")[0][1] > "1984"
    print tag, data, "\n"
  end
}

Artit_S · February 20, 2006, 6:14pm

Christian N. wrote:

print tag, data, "\n"
end
}
I hope you are joking…

I hope you’re joking.

Artit_S · February 20, 2006, 5:53pm

“William J.” [email protected] writes:

}
I hope you are joking…

Artit_S · February 21, 2006, 12:06am

Artit S. wrote:

    <firstname>Godfrey</firstname>
    <surname>Vesey</surname>
  </author>
  <title>Personal Identity: A Philosophical Analysis</title>
  <publisher>
    <publishername>Cornell University Press</publishername>
  </publisher>
  <pubdate>1977</pubdate>

class String
  def xtag(s)
    scan( %r! < #{s} (?: \s+ ( [^>]* ) )? >
              ( .*? ) </ #{s} > !mx ).
    map{ |attr, data| h = { }
      if attr
        attr.scan( %r! ( \S+ ) = " ( [^"]* ) " !x ){ |k,v|
          h[k] = v }
      end
      [ h, data ]
    }
  end
end

gets(nil).xtag("biblioentry").each { |attr,data|
  if data.xtag("pubdate")[0][1] > "1984"
    print attr["id"], data, "\n"
  end
}

Artit_S · February 21, 2006, 12:41am

William J. wrote:

class String
def xtag(s)

end

gets(nil).xtag(“biblioentry”).each { |attr,data|

}

Please stop. To the OP, use rexml.

Artit_S · February 21, 2006, 8:10am

Artit S. wrote:

    <firstname>Godfrey</firstname>
    <surname>Vesey</surname>
  </author>
  <title>Personal Identity: A Philosophical Analysis</title>
  <publisher>
    <publishername>Cornell University Press</publishername>
  </publisher>
  <pubdate>1977</pubdate>

class String
  def xtag(s)
    scan( %r!
      < #{s} (?: \s+ ( [^>]* ) )? / >
      |
      < #{s} (?: \s+ ( [^>]* ) )? >
      ( .*? ) </ #{s} >
    !mx ).
    map{ |unpaired, attr, data| h = { }
      attr = ( unpaired || attr )
      if attr
        attr.scan( %r! ( \S+ ) = " ( [^"]* ) " !x ){ |k,v|
          h[k] = v }
      end
      [ h, data ]
    }
  end
  def xshow( depth=0 )
    text = ""
    split( /<([^>]*)>/ ).each_with_index{ |s,i|
      if 0 == i % 2
        # even pieces are the text between tags
        text = s.strip
      else
        # odd pieces are the tag contents (without the angle brackets)
        indent = " " * ( depth * 2 )
        case
        when s[0,1] == "/"
          depth -= 1
          # String#map only exists in Ruby 1.8; use lines.map on later versions
          puts text.lines.map{|x| indent + x.strip } if text != ""
        when s[-1,1] == "/"
          puts indent + s
        else
          puts indent + s
          depth += 1
        end
      end
    }
  end
end

gets(nil).xtag("biblioentry").each { |attr,data|
  if data.xtag("pubdate")[0][1] > "1997"
    puts attr["id"]
    data.xshow( 1 )
  end
}

Output:

FHIW13C-1298-4
  author
    firstname
      Brian
    surname
      Garrett
  title
    Personal Identity and Self-Consciousness
  publisher
    publishername
      Routledge
  pubdate
    1998
FHIW13CX-1202-1
  author
    firstname
      John
    surname
      Perry
  title
    Identity, Personal Identity, and the Self
  publisher
    publishername
      Hackett
  pubdate
    2002

Artit_S · February 21, 2006, 8:19am

I would like to use REXML or another library, please.
Thanks.

On 2/20/06, Artit S. [email protected] wrote:

<biblioentry id="FHIW13C-1234">
Basil Blackwell Penguin http://www.rubybox.net (Thai Language)

–
Artit S.

http://www.rubybox.net (Thai Language)

Artit_S · February 21, 2006, 12:06pm

On 2/21/06, Ross B. [email protected] wrote:

(Also, REXML does support XPath, so you should be able to modify the
above to work with that. Just to be sure, I tried it 100 times over:

XPath
              user     system      total        real
rexml     9.840000   0.080000   9.920000 ( 10.046963)
libxml2   0.090000   0.000000   0.090000 (  0.139592)

Every time I’ve tried to use REXML for something I’ve found it to be
incredibly slow and painful on large files. Usually I start with
REXML, get annoyed, and then install QuiXML
(http://quixml.rubyforge.org/). Though it doesn’t have bells and
whistles, it’s a heck of a lot faster. Anyhow, I’m certainly looking
forward to your libxml2 bindings!

-Pawel

Artit_S · February 21, 2006, 11:39am

On Tue, 2006-02-21 at 16:17 +0900, Artit S. wrote:

I want to use rexml or any library ,please
thankz

As others say, for now REXML is probably the way to go, but very soon
now you’ll be able to use Libxml2 also if things keep going to plan over
here.

require 'xml/libxml'

d = XML::Parser.file('test.xml').parse
p d.find('//biblioentry[pubdate = 1977]').to_a

If you want to try it before we get to release go to CVS:
http://rubyforge.org/scm/?group_id=494

(Also, REXML does support XPath, so you should be able to modify the
above to work with that. Just to be sure, I tried it 100 times over:

XPath
              user     system      total        real
rexml     9.840000   0.080000   9.920000 ( 10.046963)
libxml2   0.090000   0.000000   0.090000 (  0.139592)
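The timing table has the layout printed by Ruby's standard Benchmark module. A rough sketch of how the REXML half of such a measurement might be set up is below. The inline document, the helper name `rexml_query`, and the choice to re-parse on every run are assumptions; the post does not say how the original benchmark was written, and the libxml side is left out because it needs the bindings installed.

```ruby
require 'benchmark'
require 'rexml/document'

# Hypothetical stand-in for the test.xml used above.
BENCH_XML = <<~XML
  <bibliography>
    <biblioentry id="example-1"><pubdate>1977</pubdate></biblioentry>
    <biblioentry id="example-2"><pubdate>2002</pubdate></biblioentry>
  </bibliography>
XML

# Parse the document and run the XPath query +runs+ times.
# Returns the matches from the final run. (Hypothetical helper.)
def rexml_query(xml, runs)
  result = nil
  runs.times do
    doc = REXML::Document.new(xml)
    result = REXML::XPath.match(doc, '//biblioentry[pubdate = 1977]')
  end
  result
end

# Benchmark.bm prints a "user system total real" header followed by one
# row per report; 8 is the width reserved for the row labels.
Benchmark.bm(8) do |bm|
  bm.report('rexml') { rexml_query(BENCH_XML, 100) }
end
```

With a document this small the numbers will be far lower than those in the table; the difference between parsers only becomes visible on larger files.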

Artit_S · February 23, 2006, 4:30am

Christian N. wrote:

print tag, data, "\n"
end
}
I hope you are joking…

Actually, in real-world usage, Mark Pilgrim’s Python Feed Parser[0]
falls back to regular expressions to get the data required if the XML is
not well-formed.

Admittedly this is a real problem for RSS hackers, less so with other
XML messages, but the approach does have merit if (a) you can’t
guarantee well-formedness and (b) you absolutely have to have the data.

-dave

[0] http://feedparser.org/
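A small Ruby sketch of that "parser first, regex as a last resort" idea follows. It is not how Feed Parser is implemented; the helper name `pubdates` and both sample strings are made up for illustration.

```ruby
require 'rexml/document'

# Well-formed input, and the same input with </biblioentry> missing.
WELL_FORMED = '<bibliography><biblioentry><pubdate>1977</pubdate></biblioentry></bibliography>'
BROKEN      = '<bibliography><biblioentry><pubdate>1977</pubdate></bibliography>'

# Try a real XML parser first. Only if the input is not well-formed,
# fall back to a crude regular-expression scan for <pubdate> contents.
# Returns the pubdate strings found. (Hypothetical helper.)
def pubdates(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//pubdate').map { |element| element.text }
rescue REXML::ParseException
  xml.scan(%r{<pubdate>\s*(.*?)\s*</pubdate>}m).flatten
end

p pubdates(WELL_FORMED)
p pubdates(BROKEN)
```

The fallback only kicks in when parsing fails, so well-formed documents still get proper handling of entities, CDATA, and nesting.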

Artit_S · February 21, 2006, 4:42pm

“William J.” [email protected] writes:

  <author>
class String
}
Still doesn’t support namespaces, entities and CDATA…
(Or nested tags like

.)
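To illustrate the entity, CDATA, and nesting problems, here is a short sketch comparing a naive regex extraction with REXML. The sample documents, the `note` tag, and the helper names `regex_text` and `rexml_text` are invented for this example; the regex is a simplified stand-in for the `xtag` approach, not a copy of it.

```ruby
require 'rexml/document'

# Hypothetical document containing an entity and a CDATA section.
TRICKY_XML = <<~XML
  <bibliography>
    <biblioentry id="example-1">
      <title>Mind &amp; Body</title>
      <publisher><publishername><![CDATA[Smith & Sons]]></publishername></publisher>
      <pubdate>1999</pubdate>
    </biblioentry>
  </bibliography>
XML

# Hypothetical document where a tag is nested inside a tag of the same name.
NESTED_XML = '<note>outer <note>inner</note> tail</note>'

# Naive regex extraction: returns the raw characters between the first
# opening tag and the first closing tag of that name.
def regex_text(xml, tag)
  xml[%r{<#{tag}>(.*?)</#{tag}>}m, 1]
end

# Parser-based extraction: REXML decodes entities and unwraps CDATA.
def rexml_text(xml, tag)
  REXML::XPath.first(REXML::Document.new(xml), "//#{tag}").text
end

puts regex_text(TRICKY_XML, 'title')          # raw text, entity left encoded
puts rexml_text(TRICKY_XML, 'title')          # decoded text
puts regex_text(TRICKY_XML, 'publishername')  # CDATA wrapper left in place
puts rexml_text(TRICKY_XML, 'publishername')  # CDATA content only
puts regex_text(NESTED_XML, 'note')           # stops at the inner closing tag
puts REXML::XPath.match(REXML::Document.new(NESTED_XML), '//note').size
```

The regex version stops at the first closing tag it sees, so for the nested case it returns a fragment that still contains an unclosed inner tag, while the parser reports two distinct `note` elements.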