More search and replace

[Total novice]

A follow-up on my last email (“search and replace”). I am trying to
convert an OOo xml source (content.xml) to TeX. It’s a bibliography and
thus very predictable/regular/simple etc. Each entry looks roughly like
this (simplified):

====================================
<text:p text:style-name="ID">[<text:sequence text:ref-name="refAutoNr3"
text:name="AutoNr" text:formula="ooow:AutoNr+1"
style:num-format="1">4</text:sequence></text:p>
<text:p text:style-name="Standard">Ben</text:p>
<text:p text:style-name="reference">
<text:span text:style-name="T10">Article</text:span>.,
<text:span text:style-name="Style2">Journal</text:span>,
volume, issue, year.
</text:p>
<text:p text:style-name="reference"/>
<text:p text:style-name="reference"/>
====================================

I. line one is discussed in my last email. Basically, each line of this
type (numbers are variable) needs to be converted to

====
\head
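
A minimal sketch of rule I, assuming the attributes use straight double quotes (as in a real content.xml) and that the style name "ID" marks every numbered line. The helper name `replace_id_paragraphs` is made up for illustration; it is not part of the script quoted further down.

```ruby
# Hypothetical helper: swap every "ID" paragraph for a bare \head.
def replace_id_paragraphs(xml)
  # /m lets . cross line breaks; .*? stops at the first closing tag.
  # The block form keeps the backslash in '\head' literal.
  xml.gsub(/<text:p text:style-name="ID">.*?<\/text:p>/m) { '\head' }
end

sample = '<text:p text:style-name="ID">[<text:sequence text:ref-name="refAutoNr3" ' \
         'style:num-format="1">4</text:sequence></text:p>'
puts replace_id_paragraphs(sample)
```

Because the sequence number sits inside the paragraph, it is discarded along with the markup, so variable numbers need no special handling.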

II.

<text:p text:style-name="P6">Jim</text:p>
<text:p text:style-name="P8">Michael</text:p>
<text:p text:style-name="Standard">Ben</text:p>

replace each with the name plus a linespace

====================================
Jim

Michael

Ben
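
Rule II could be sketched like this, assuming the three style names in the sample (P6, P8, Standard) are the only ones used for names. `NAME_STYLES` and `names_to_lines` are hypothetical names chosen for this example.

```ruby
# Hypothetical list of paragraph styles that hold a plain name.
NAME_STYLES = %w[P6 P8 Standard].freeze

def names_to_lines(xml)
  styles = Regexp.union(NAME_STYLES)
  # [^<]* only accepts paragraphs without nested tags; the optional \n
  # swallows the line break that followed the closing tag.
  xml.gsub(/<text:p text:style-name="(?:#{styles})">([^<]*)<\/text:p>\n?/) do
    "#{$1}\n\n"
  end
end

sample = <<~'XML'
  <text:p text:style-name="P6">Jim</text:p>
  <text:p text:style-name="P8">Michael</text:p>
  <text:p text:style-name="Standard">Ben</text:p>
XML
puts names_to_lines(sample)
```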

III. <text:span text:style-name="T10">Article</text:span>

If the style-name="T10", then the argument should be, e.g., {\bf
Article};
if the style-name="Style2", then the argument should be, e.g., {\it
Journal}.
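
One way to express rule III is a lookup table from style name to a TeX wrapper, in the same spirit as the `inline` hash in the script below. This is only a sketch: `SPAN_WRAPPERS` and `convert_spans` are hypothetical names, and spans with an unknown style are assumed to pass through as bare text.

```ruby
# Hypothetical table: span style name => [opening, closing] TeX code.
SPAN_WRAPPERS = {
  'T10'    => ['{\bf ', '}'],
  'Style2' => ['{\it ', '}']
}.freeze

def convert_spans(xml)
  xml.gsub(/<text:span text:style-name="([^"]*)">(.*?)<\/text:span>/m) do
    style, text = $1, $2
    # Unknown styles fall back to empty wrappers, leaving only the text.
    open_tex, close_tex = SPAN_WRAPPERS.fetch(style, ['', ''])
    open_tex + text + close_tex
  end
end

puts convert_spans('<text:span text:style-name="T10">Article</text:span>')
puts convert_spans('<text:span text:style-name="Style2">Journal</text:span>')
```

Adding another style then means adding one line to the table rather than another regexp.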

IV. So the final output should be something like

====================================
\head Ben

{\bf Article}, {\it Journal}, volume, issue, year.

====================================
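
Putting rules I to III together for a single entry might look like the sketch below. It is a standalone illustration, not a patch to the script that follows: `INLINE` and `entry_to_context` are hypothetical names, and the sample drops the stray "." after the Article span so that the result matches the target output above.

```ruby
# Hypothetical table: span style name => [opening, closing] TeX code.
INLINE = { 'T10' => ['{\bf ', '}'], 'Style2' => ['{\it ', '}'] }.freeze

def entry_to_context(xml)
  out = xml.dup
  # Rule I: the "ID" paragraph becomes \head; the trailing space lets the
  # name from the next paragraph land on the same line.
  out.gsub!(/<text:p text:style-name="ID">.*?<\/text:p>\s*/m) { '\head ' }
  # Rule III: wrap spans according to their style.
  out.gsub!(/<text:span text:style-name="([^"]*)">(.*?)<\/text:span>/m) do
    open_tex, close_tex = INLINE.fetch($1, ['', ''])
    open_tex + $2 + close_tex
  end
  # Drop empty paragraphs such as <text:p .../>.
  out.gsub!(/<text:p[^>]*\/>\s*/, '')
  # Rule II: every remaining paragraph becomes its text plus a blank line,
  # with internal line breaks squeezed into single spaces.
  out.gsub!(/<text:p[^>]*>(.*?)<\/text:p>\s*/m) do
    $1.split.join(' ') + "\n\n"
  end
  out
end

entry = <<~'XML'
  <text:p text:style-name="ID">[<text:sequence text:ref-name="refAutoNr3" style:num-format="1">4</text:sequence></text:p>
  <text:p text:style-name="Standard">Ben</text:p>
  <text:p text:style-name="reference">
  <text:span text:style-name="T10">Article</text:span>,
  <text:span text:style-name="Style2">Journal</text:span>,
  volume, issue, year.
  </text:p>
  <text:p text:style-name="reference"/>
  <text:p text:style-name="reference"/>
XML
puts entry_to_context(entry)
```

The order matters here: spans are converted before the paragraphs that contain them, and empty paragraphs are removed before the generic paragraph rule runs.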

I hope to get enough info here to be able to finish this myself. I
assume finishing my script would only take one of you guys 15 or 20
minutes :wink: If I’m not able to get things working quickly (trying to
learn Ruby and do my work at the same time) I will be happy to pay one
of you for an hour or so of work (I’m up against a deadline).

THANK YOU
Idris

PS For reference, here is the script I’m trying to modify for this OOo
bibliography:

=====================================
class OpenOffice

# using an xml parser is overkill and we need to regexp anyway

attr_reader :display, :inline, :translate
attr_writer :display, :inline, :translate

def initialize
    @data = nil
    @file = ''
    @display = Hash.new
    @inline = Hash.new
    @translate = Hash.new
end

def load(filename)
    if not filename.empty? and FileTest.file?(filename) then
        begin
            @data, @file = IO.read(filename), filename
        rescue
            @data, @file = nil, ''
        end
    else
        @data, @file = nil, ''
    end
end

def save(filename='')
    if filename.empty? then
        filename = "clean-#{@file}"
    end
    if f = open(filename,'w') then
        f.puts(@data)
        f.close
    end
end

def convert
    @translations = Hash.new
    @translate.each do |k,v|
        @translations[/#{k}/] = v
    end
    if @data then
        @data.gsub!(/<\?.*?\?>/) do
            # remove
        end
        @data.gsub!(/<!--.*?-->/) do
            # remove
        end
        # no /s flag here: in Ruby it selects an encoding; /m is what
        # lets . match newlines
        @data.gsub!(/.*?<(office:text).*?>(.*?)<\/\1>.*/mi) do
            '\starttext' + "\n" + $2 + "\n" + '\stoptext'
        end
        @data.gsub!(/<(office:font-face-decls|office:automatic-styles|text:sequence-decls).*?>.*?<\/\1>/mi) do
            # remove
        end
        # inline styles: wrap span text using the inline table
        @data.gsub!(/<text:span.*?text:style-name=([\'\"])(.*?)\1.*?>(.*?)<\/text:span>/) do
            tag, text = $2, $3
            if inline[tag] then
                (inline[tag][0]||'') + clean_display(text) + (inline[tag][1]||'')
            else
                clean_display(text)
            end
        end
        # empty paragraphs such as <text:p .../>
        @data.gsub!(/<text:p[^>]*?\/>/) do
            # remove
        end
        # display styles: wrap paragraph text using the display table
        @data.gsub!(/<text:p.*?text:style-name=([\'\"])(.*?)\1.*?>(.*?)<\/text:p>/) do
            tag, text = $2, $3
            if display[tag] then
                "\n" + (display[tag][0]||'') + clean_inline(text) + (display[tag][1]||'') + "\n"
            else
                "\n" + clean_inline(text) + "\n"
            end
        end
        @data.gsub!(/\t/,' ')
        @data.gsub!(/^ +$/,'')
        @data.gsub!(/\n\n+/moi,"\n\n")
    end
end

def clean_display(str)
    str.gsub!(/&quot;(.*?)&quot;/) do
        '\quotation {' + $1 + '}'
    end
    str
end

def clean_inline(str)
    @translations.each do |k,v|
        str.gsub!(k,v)
    end
    str
end

end

def convert(filename)

doc = OpenOffice.new

doc.display['P1'] = ['\chapter{','}']
doc.display['P2'] = ['\startparagraph'+"\n","\n"+'\stopparagraph']
doc.display['P3'] = doc.display['P2']

doc.inline['T1'] = ['','']
doc.inline['T2'] = ['{\sl ','}']

doc.translate['¬'] = 'XX'
doc.translate['&apos;'] = '`'

doc.load(filename)

doc.convert

doc.save

end

filename = ARGV[0]

filename = 'content.xml' if not filename or filename.empty?

convert(filename)

Are you using OOo 2.0.4? I know it has a TeX/BibTeX export feature
now…

It’s not Ruby, but it should work (unless you’re using this with some
sort of automated system). :slight_smile:

–Jeremy

Hi Jeremy,

On Dec 2, 11:57 am, “Jeremy McAnally” wrote:

Are you using OOo 2.0.4? I know it has a TeX/BibTeX export feature now…

Wow, I did not know this, but…

It’s not Ruby, but it should work (unless you’re using this with some
sort of automated system). :slight_smile:

I use ConTeXt, not LaTeX, and the two are really different, so…

I am sending a note to the ConTeXt developers list about this; maybe
some of them can port the OOo LaTeX filters to ConTeXt. In the meantime
I think it’s best to finish that script…

Thank you very much for letting me know about OOo and LaTeX!

Best
Idris

On Dec 2, 12:21 pm, “ishamid” wrote:

Hi Jeremy,

On Dec 2, 11:57 am, “Jeremy McAnally” wrote:

Are you using OOo 2.0.4? I know it has a TeX/BibTeX export feature now…

Wow, I did not know this, but…

It’s not Ruby, but it should work (unless you’re using this with some
sort of automated system). :slight_smile:

I checked it out; the source is way too messy for my purposes; it will
be much easier to convert the xml to ConTeXt than the LaTeX to ConTeXt.

Thnx again
Idris