[Total novice]
A follow-up on my last email (“search and replace”). I am trying to
convert an OOo xml source (content.xml) to TeX. It’s a bibliography and
thus very predictable/regular/simple etc. Each entry looks roughly like
this (simplified):
====================================
<text:p text:style-name="ID">[<text:sequence text:ref-name="refAutoNr3"
text:name="AutoNr" text:formula="ooow:AutoNr+1"
style:num-format="1">4</text:sequence></text:p>
<text:p text:style-name="Standard">Ben</text:p>
<text:p text:style-name="reference">
<text:span text:style-name="T10">Article</text:span>.,
<text:span text:style-name="Style2">Journal</text:span>,
volume, issue, year.
</text:p>
<text:p text:style-name="reference"/>
<text:p text:style-name="reference"/>
I. Line one is discussed in my last email. Basically, each line of this
type (the numbers vary) needs to be converted to
====
\head
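A minimal sketch of step I, assuming every entry number sits in a paragraph whose style-name is "ID", as in the sample above. The helper name `replace_id_lines` and the constant `ID_PARAGRAPH` are made up for illustration and are not part of the script in the PS.

```ruby
# Hypothetical helper: replace every paragraph with style "ID" by a bare
# \head, whatever number the text:sequence element carries.
ID_PARAGRAPH = /<text:p text:style-name="ID">.*?<\/text:p>/m

def replace_id_lines(xml)
  # Block form of gsub: the replacement is taken literally, so the
  # backslash in \head needs no extra escaping.
  xml.gsub(ID_PARAGRAPH) { '\head' }
end

sample = '<text:p text:style-name="ID">[<text:sequence text:ref-name="refAutoNr3" ' \
         'text:name="AutoNr" text:formula="ooow:AutoNr+1" ' \
         'style:num-format="1">4</text:sequence></text:p>'

puts replace_id_lines(sample)   # prints \head
```

The non-greedy `.*?` together with the `m` flag lets the match run across line breaks inside the paragraph but stop at the first closing `</text:p>`.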
II.
<text:p text:style-name="P6">Jim</text:p>
<text:p text:style-name="P8">Michael</text:p>
<text:p text:style-name="Standard">Ben</text:p>
Replace each with the name followed by a line break:
====================================
Jim
Michael
Ben
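Step II could be sketched as follows. This assumes that a "name" paragraph is any text:p that contains plain text only (no nested tags), and that "linespace" means a single line break; both are guesses on top of the description above. `extract_names` and `NAME_PARAGRAPH` are hypothetical names.

```ruby
# Hypothetical helper: keep only the text of plain paragraphs, whatever
# their style name (P6, P8, Standard, ...), one name per line.
# [^<]* refuses paragraphs that contain nested tags, so the ID and
# reference paragraphs are left alone.
NAME_PARAGRAPH = /<text:p text:style-name="[^"]*">([^<]*)<\/text:p>\s*/

def extract_names(xml)
  xml.gsub(NAME_PARAGRAPH) { "#{$1}\n" }
end

sample = <<~XML
  <text:p text:style-name="P6">Jim</text:p>
  <text:p text:style-name="P8">Michael</text:p>
  <text:p text:style-name="Standard">Ben</text:p>
XML

puts extract_names(sample)
```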
III. <text:span text:style-name="T10">Article</text:span>
If the style-name is "T10", then the text should become, e.g., {\bf Article};
if the style-name is "Style2", then the text should become, e.g., {\it Journal}.
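One way to express step III is a lookup table from style name to a pair of TeX wrappers, the same shape as the `inline` hash in the script in the PS. The sketch below is illustrative only: `SPAN_STYLES`, `SPAN` and `convert_spans` are invented names, and it assumes the style-name is the only attribute on the span.

```ruby
# Assumed mapping from OOo character styles to TeX wrappers.
SPAN_STYLES = {
  'T10'    => ['{\bf ', '}'],
  'Style2' => ['{\it ', '}']
}

SPAN = /<text:span text:style-name="([^"]*)">(.*?)<\/text:span>/m

def convert_spans(xml)
  xml.gsub(SPAN) do
    style, text = $1, $2
    # Unknown styles fall back to the bare text.
    before, after = SPAN_STYLES.fetch(style, ['', ''])
    "#{before}#{text}#{after}"
  end
end

puts convert_spans('<text:span text:style-name="T10">Article</text:span>')
# prints {\bf Article}
```

Adding another style then only means adding one more line to the table.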
IV. So the final output should be something like
====================================
\head Ben
{\bf Article}, {\it Journal}, volume, issue, year.
====================================
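Putting steps I to III together, a rough end-to-end sketch for one entry might look like this. It is a standalone illustration, not a modification of the script in the PS: `convert_entry`, `ENTRY_STYLES` and `ENTRY` are invented names, the stray "." after the article span in the sample is left out so that the result matches the target output above, and the order of the substitutions is an assumption.

```ruby
# Assumed style table for this bibliography (illustrative only).
ENTRY_STYLES = {
  'T10'    => ['{\bf ', '}'],
  'Style2' => ['{\it ', '}']
}

def convert_entry(xml)
  out = xml.dup
  # 1. Drop empty paragraphs such as <text:p text:style-name="reference"/>.
  out.gsub!(/<text:p[^>]*\/>\s*/, '')
  # 2. The numbered ID paragraph becomes \head, joined to the next line.
  out.gsub!(/<text:p text:style-name="ID">.*?<\/text:p>\s*/m) { '\head ' }
  # 3. Styled spans become TeX font switches.
  out.gsub!(/<text:span text:style-name="([^"]*)">(.*?)<\/text:span>/m) do
    before, after = ENTRY_STYLES.fetch($1, ['', ''])
    "#{before}#{$2}#{after}"
  end
  # 4. Remaining paragraphs: keep the text, one paragraph per output line.
  out.gsub!(/<text:p[^>]*>\s*(.*?)\s*<\/text:p>\s*/m) do
    body = $1
    body.gsub(/\s*\n\s*/, ' ') + "\n"
  end
  out
end

ENTRY = <<~XML
  <text:p text:style-name="ID">[<text:sequence text:ref-name="refAutoNr3"
  text:name="AutoNr" text:formula="ooow:AutoNr+1"
  style:num-format="1">4</text:sequence></text:p>
  <text:p text:style-name="Standard">Ben</text:p>
  <text:p text:style-name="reference">
  <text:span text:style-name="T10">Article</text:span>,
  <text:span text:style-name="Style2">Journal</text:span>,
  volume, issue, year.
  </text:p>
  <text:p text:style-name="reference"/>
  <text:p text:style-name="reference"/>
XML

puts convert_entry(ENTRY)
```

The empty paragraphs are removed first so that the generic paragraph pattern in step 4 never sees them.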
I hope to get enough info here to be able to finish this myself. I
assume finishing my script would only take one of you guys 15 or 20
minutes. If I’m not able to get things working quickly (I am trying to
learn Ruby and do my work at the same time), I will be happy to pay one
of you for an hour or so of work (I’m up against a deadline).
THANK YOU
Idris
PS For reference, here is the script I’m trying to modify for this OOo
bibliography:
=====================================
class OpenOffice

  # using an xml parser is overkill and we need to regexp anyway

  attr_reader :display, :inline, :translate
  attr_writer :display, :inline, :translate

  def initialize
    @data = nil
    @file = ''
    @display = Hash.new
    @inline = Hash.new
    @translate = Hash.new
  end

  def load(filename)
    if not filename.empty? and FileTest.file?(filename) then
      begin
        @data, @file = IO.read(filename), filename
      rescue
        @data, @file = nil, ''
      end
    else
      @data, @file = nil, ''
    end
  end

  def save(filename='')
    if filename.empty? then
      filename = "clean-#{@file}"
    end
    if f = open(filename,'w') then
      f.puts(@data)
      f.close
    end
  end

  def convert
    @translations = Hash.new
    @translate.each do |k,v|
      @translations[/#{k}/] = v
    end
    if @data then
      @data.gsub!(/<\?.*?\?>/) do
        # remove
      end
      @data.gsub!(/<!--.*?-->/) do
        # remove
      end
      @data.gsub!(/.*?<(office:text).*?>(.*?)<\/\1>.*/mois) do
        '\starttext' + "\n" + $2 + "\n" + '\stoptext'
      end
      @data.gsub!(/<(office:font-face-decls|office:automatic-styles|text:sequence-decls).*?>.*?<\/\1>/mois) do
        # remove
      end
      @data.gsub!(/<text:span.*?text:style-name=([\'\"])(.*?)\1.*?>(.*?)<\/text:span>/) do
        tag, text = $2, $3
        if inline[tag] then
          (inline[tag][0]||'') + clean_display(text) + (inline[tag][1]||'')
        else
          clean_display(text)
        end
      end
      @data.gsub!(/<text:p[^>]*?\/>/) do
        # remove
      end
      @data.gsub!(/<text:p.*?text:style-name=([\'\"])(.*?)\1.*?>(.*?)<\/text:p>/) do
        tag, text = $2, $3
        if display[tag] then
          "\n" + (display[tag][0]||'') + clean_inline(text) + (display[tag][1]||'') + "\n"
        else
          "\n" + clean_inline(text) + "\n"
        end
      end
      @data.gsub!(/\t/,' ')
      @data.gsub!(/^ +$/,'')
      @data.gsub!(/\n\n+/moi,"\n\n")
    end
  end

  def clean_display(str)
    str.gsub!(/"(.*?)"/) do
      '\quotation {' + $1 + '}'
    end
    str
  end

  def clean_inline(str)
    @translations.each do |k,v|
      str.gsub!(k,v)
    end
    str
  end

end

def convert(filename)
  doc = OpenOffice.new
  doc.display['P1'] = ['\chapter{','}']
  doc.display['P2'] = ['\startparagraph'+"\n","\n"+'\stopparagraph']
  doc.display['P3'] = doc.display['P2']
  doc.inline['T1'] = ['','']
  doc.inline['T2'] = ['{\sl ','}']
  doc.translate['¬'] = 'XX'
  doc.translate['&apos;'] = '`'
  doc.load(filename)
  doc.convert
  doc.save
end

filename = ARGV[0]
filename = 'content.xml' if not filename or filename.empty?