More Regexp and file load problems


#1

Hi All,

I am struggling here and would be most appreciative of any help.

I have a regular expression problem I can’t seem to figure my way out
of. I sent a similar request this morning and gratefully
received two emails from David Black and Axel E… Unfortunately,
I am still having difficulty.

I am trying to extract the font name attribute out of the following
XML (this is an excerpt of a larger file). The weird thing is that
my regular expression correctly matches the text if I use TextMate’s
regular expression ‘Find in Project’ feature, and also in the freeware
Reggy regular expression test tool found on Google Code. It also
matches correctly if I test against a subset of the XML being
parsed, i.e., see this first example.

#!/usr/bin/env ruby

# Created by Don L. on 2007-06-26.
# Copyright © 2007. All rights reserved.

string = %q(



Helvetica
9
256
#000000
")

regexp = Regexp.new(/^\s*<Font-family codeSet="\w*" fontId="\d*">(\w*)<\/Font-family>\s*$/)

if string =~ regexp
  puts "#{$1}"
end

Result: Helvetica

However, if I run this script, there is no result.

#!/usr/bin/env ruby

# Created by Don L. on 2007-06-26.
# Copyright © 2007. All rights reserved.

regexp = Regexp.new(/^\s*<Font-family codeSet="\w*" fontId="\d*">(\w*)<\/Font-family>\s*$/)
file = File.new('/Users/donlevan/Desktop/DDRs/Apple Dealer Price List.xml')

file.each do |line|
  if line =~ regexp
    puts "#{$1}"
  end
end

I have looked at Hpricot and XmlSimple. Unfortunately, I cannot get
TextMate configured correctly, so I keep getting LoadErrors when I try
to use RubyGems.

Thanks,

Don


#2

From: Don L. [mailto:removed_email_address@domain.invalid] :

> file = File.new('/Users/donlevan/Desktop/DDRs/Apple Dealer Price List.xml')
>
> file.each do |line|
>   if line =~ regexp

Since your regex passes the string test, I would suspect the contents of
the file itself. Possibly the expression is broken across multiple lines,
in which case no single line would match…

>     puts "#{$1}"
>   end
> end

try grepping the file first

root@pc4all:~# grep -i Helvetica test.txt
Helvetica

then pass the result to a simplified Ruby program

root@pc4all:~# cat test.rb
regexp = Regexp.new(/^\s*<Font-family codeSet="\w*" fontId="\d*">(\w*)<\/Font-family>\s*$/)
ARGF.each do |line|
  if line =~ regexp
    puts "#{$1}"
  end
end

root@pc4all:~# grep -i Helvetica test.txt | ruby test.rb
Helvetica
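
If the grep test passes but the original script still prints nothing, the multi-line suspicion above can be checked with a small experiment. The following is only a sketch under assumptions: the sample XML is hypothetical (the attribute values and the line breaks inside the element are invented for illustration), and `whole_regexp` is a loosened variant of the regex from the thread, not something the original poster wrote.

```ruby
# Hypothetical file contents: the element is wrapped across lines,
# which is one way a line-by-line loop could miss it.
content = <<XML
<Font-family codeSet="Roman" fontId="3">
Helvetica
</Font-family>
XML

# The regex from the thread, applied one line at a time.
line_regexp = /^\s*<Font-family codeSet="\w*" fontId="\d*">(\w*)<\/Font-family>\s*$/
per_line = content.each_line.map { |line| line[line_regexp, 1] }.compact

# Assumed variant: no line anchors, and \s* around the captured name,
# so a match may span line breaks when run against the whole text.
whole_regexp = /<Font-family codeSet="\w*" fontId="\d*">\s*(\w*)\s*<\/Font-family>/
whole_file = content.scan(whole_regexp).flatten

puts "line by line: #{per_line.inspect}"
puts "whole file:   #{whole_file.inspect}"
```

Against a real file, `content` would come from `File.read(path)` instead of the inline sample. If the whole-file scan finds names that the line loop misses, the element is probably split across lines in the file.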


#3

On Jun 27, 9:44 pm, Don L. removed_email_address@domain.invalid wrote:

> xml (this is an excerpt of a larger file). The weird thing, is that
>
> However, if I run this script, there is no result.
>
> Thanks,
>
> Don

You could use an XML library, possibly REXML.
Don’t forget require 'rubygems' before loading any gems.
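
A minimal sketch of the REXML route, under assumptions: the sample document is hypothetical (only the `Font-family` element name and the `codeSet`/`fontId` attribute names come from the regex in the thread; the `Styles` wrapper and all values are invented). REXML ships with Ruby, so this particular example should not need `require 'rubygems'`.

```ruby
require 'rexml/document'

# Hypothetical sample; the real file's surrounding structure is unknown.
xml = <<XML
<Styles>
  <Font-family codeSet="Roman" fontId="3">
    Helvetica
  </Font-family>
  <Font-family codeSet="Roman" fontId="4">Courier</Font-family>
</Styles>
XML

doc = REXML::Document.new(xml)

# Collect the text of every Font-family element, wherever it sits in the
# tree. strip removes whitespace left over from line breaks in the element.
font_names = REXML::XPath.match(doc, '//Font-family').map { |el| el.text.strip }

puts font_names
```

For a file on disk, `REXML::Document.new(File.read(path))` would replace the inline sample. Because the parser works on the document tree rather than on lines, it does not matter how the element is wrapped in the file.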