Simple regex question

Hello.
I need to parse through thousands of TIFF files and do some re-naming.
These files have underscores in them followed by a sequential number. I
need to grab just the “root” of the filename, without the underscore or
the numbers.
Dir.chdir("L:/infocontiffs/ehs-g7917741")
files = Dir.glob("*.tiff")
file = files[0]
puts file
file = file.gsub(/^(.*)_[0-9]+.tiff/, "#{$1}")
puts file
What I get with this is:
ehs-g7917741_01.tiff
Why doesn’t it give me my root filename?
Thanks,
Peter

Is this what you want?

while fname = DATA.gets
  m = fname.match(/(.*?)_\d+\.tiff/)
  if m
    puts "Match: '#{m[1]}'"
  else
    puts "No match: #{fname}"
  end
end

__END__
ehs-g7917741_01.tiff
asadsasd_12345.tiff
ljhkjhkh_1_2_3.tiff
xxxx__1.tiff
xxxx_.tiff
xxxx.tiff
xxxx
_.tiff
_01.tiff

Peter B. wrote:

file = file.gsub(/^(.*)_[0-9]+.tiff/, "#{$1}")

The argument "#{$1}" is expanded once, before gsub even executes, so it
cannot see the match that gsub is about to make. You probably want the
block form:

file = file.sub(/^(.*)_\d+.tiff/) { $1 }
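
To make the timing issue concrete, here is a minimal sketch using the file name from the question. The variable names are only for illustration, and the dot before "tiff" is escaped here, which the original pattern did not do. It compares the interpolated string, the block form, and a single-quoted '\1' backreference, which sub and gsub resolve themselves.

```ruby
name = "ehs-g7917741_01.tiff"

# A failed match clears the match data, so $1 is nil here,
# just as it is at the start of a fresh script.
"x" =~ /y/

# The replacement string is built before gsub runs: "#{$1}" is "",
# so the whole matched name is replaced by an empty string.
interpolated = name.gsub(/^(.*)_[0-9]+\.tiff/, "#{$1}")

# Block form: the block runs after each match, so $1 is already set.
block_form = name.sub(/^(.*)_\d+\.tiff/) { $1 }

# Backreference: '\1' is passed through literally and expanded by sub.
backref = name.sub(/^(.*)_\d+\.tiff/, '\1')

puts interpolated.inspect
puts block_form
puts backref
```

The single quotes matter in the last form: inside double quotes the backslash sequence would be processed by Ruby before sub ever sees it.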

Hi –

On Fri, 26 Jun 2009, Peter B. wrote:

puts file
What I get with this is:
ehs-g7917741_01.tiff
Why doesn’t it give me my root filename?

Here’s another good use of the string[//] technique:

file = "ehs-g7917741_01.tiff"
=> "ehs-g7917741_01.tiff"

file[/[^_]+/] # match non-underscore characters
=> "ehs-g7917741"
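
As a small sketch of the same technique, String#[] also accepts a capture index as a second argument and returns nil when the pattern does not match. The stricter pattern and the "notes.txt" name below are assumptions added for illustration, not part of the suggestion above.

```ruby
file = "ehs-g7917741_01.tiff"

# First run of non-underscore characters, as in the example above.
root = file[/[^_]+/]

# Capture group 1, returned only if the whole name fits the pattern.
strict = file[/\A(.+)_\d+\.tiff\z/, 1]

# A name that does not fit: the strict form returns nil, while the
# simple form returns the whole name because it contains no underscore.
other = "notes.txt"
strict_other = other[/\A(.+)_\d+\.tiff\z/, 1]
simple_other = other[/[^_]+/]

puts root
puts strict
puts strict_other.inspect
puts simple_other
```

The simple form is the shortest way to get the root when every name is known to follow the convention; the capture form is useful when that is not guaranteed.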

David

Beautiful. Thanks.

Combining all the good suggestions this is probably what I’d do:

files = Dir.glob("L:/infocontiffs/ehs-g7917741/*.tiff")
files.each do |f|
  base = File.basename f
  root = base[/^([^_]+)_\d+\.tiff$/, 1]

  if root
    # rename or whatever
  else
    $stderr.puts "Dunno what to do with #{f}"
  end
end

I left in the matching of the underscore and digits so that the
complete name has to match the required pattern. That way we detect
other files that might accidentally have been placed in that
directory.
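
A dry run of that idea, without touching the file system, might look like the sketch below. The constant PATTERN and the helper root_of are hypothetical names, the sample names are the test data posted earlier in the thread, and \A and \z are used instead of ^ and $ as a stricter whole-string anchor.

```ruby
# Whole-name pattern: root without underscores, one underscore,
# digits, then the .tiff extension.
PATTERN = /\A([^_]+)_\d+\.tiff\z/

# Returns the root of the file name, or nil if the name does not fit.
def root_of(path)
  File.basename(path)[PATTERN, 1]
end

names = %w[
  ehs-g7917741_01.tiff asadsasd_12345.tiff ljhkjhkh_1_2_3.tiff
  xxxx__1.tiff xxxx_.tiff xxxx.tiff _01.tiff
]

# Split the names into those we can handle and those we cannot.
matched, rejected = names.partition { |n| root_of(n) }

matched.each  { |n| puts "#{n} -> #{root_of(n)}" }
rejected.each { |n| warn "Dunno what to do with #{n}" }
```

Only the first two names fit the pattern; the rest are reported rather than silently renamed.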

Kind regards

robert

Well, you gave me a good idea, using match. Here’s what I did, and it
worked. Thank you very much, Tim.

Dir.chdir("L:/infocontiffs/ehs-g7917741")
files = Dir.glob("*.tiff")
file = files[0]
puts file
file = file.match(/^(.*)_[0-9]+.tiff/)
#file = file.to_i
puts $1
#end
gives me:
ehs-g7917741_01.tiff
ehs-g7917741

    Program exited with code 0