Regex for splitting filenames


#1

Hello all,

I want to split a filename into its root and extension, i.e.:
someFileName.txt = ‘someFileName’ and ‘txt’

This is simple enough with string.split(), but what if the file has more than
one period in it? I have worked around this by doing:

filename = "some.file.name.txt"
temp = filename.split(".")
type = temp.pop
fileroot = temp.join(".")

I was wondering though if instead of the temp variable I could just do:

fileroot, type = filename.split(/regex here?/)

I cannot find a way to write a regex that only matches the last period in the
filename. Is there an elegant way to do this?

Thanks,
-d


#2

On Wed, Jun 07, 2006 at 10:30:23AM +0900, darren kirby wrote:

type = temp.pop

base = ("#{filename}"[0..("#{filename}".rindex('.')-1)])
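
A slightly more defensive sketch of the same rindex idea, in case the filename has no period at all. The helper name `split_on_last_dot` is made up for illustration, and returning nil for a missing extension is just one possible choice.

```ruby
# Hypothetical helper: split a filename at its last period using rindex.
def split_on_last_dot(filename)
  idx = filename.rindex('.')
  # No period at all: treat the whole name as the root, with no extension.
  return [filename, nil] if idx.nil?
  # The exclusive range stops just before the period; the rest follows it.
  [filename[0...idx], filename[(idx + 1)..-1]]
end

p split_on_last_dot("some.file.name.txt")  # => ["some.file.name", "txt"]
p split_on_last_dot("README")              # => ["README", nil]
```

The exclusive range (three dots) avoids the `- 1` arithmetic, which also sidesteps the case where the period is the very first character.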


#3

On 6/6/06, darren kirby removed_email_address@domain.invalid wrote:

fileroot, type = filename.split(/regex here?/)

I cannot find a way to write a regex that only matches the last period in the
filename. Is there an elegant way to do this?

ext = File.extname(filename)
file = File.basename(filename, ext)
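
For anyone trying this out, a minimal sketch of what those two calls return. The path is an arbitrary example, and the `type` line is an extra step (not part of the reply above) for when you want the extension without its period.

```ruby
filename = "/tmp/some.file.name.txt"   # example path, chosen for illustration

ext  = File.extname(filename)          # => ".txt" (the leading period is included)
root = File.basename(filename, ext)    # => "some.file.name" (the directory is dropped too)

# Strip the leading period if only the bare extension is wanted.
type = ext.sub(/\A\./, "")             # => "txt"

# Passing ".*" as the suffix strips whatever the last extension is.
same_root = File.basename(filename, ".*")
```

Note that File.basename also removes the directory part, so this is not a drop-in replacement for the split/pop/join workaround when the input contains a path.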

-austin


#4

On Wed, 7 Jun 2006, darren kirby wrote:

type = temp.pop
-d
jib:~ > ruby -e 'p "bar.txt".split( %r/\.([^.]+)$/ )'
["bar", "txt"]

jib:~ > ruby -e 'p "/foo/bar.txt".split( %r/\.([^.]+)$/ )'
["/foo/bar", "txt"]
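
The trick here is that String#split includes the text of any capture groups in its result, so the extension comes back as the second element. A small sketch of how that behaves on a few inputs, assuming the pattern is a literal period followed by a period-free run up to the end of the line (the constant name is only for illustration):

```ruby
# A literal period, then the rest of the line with no further periods (captured).
LAST_EXT = /\.([^.]+)$/

with_ext    = "some.file.name.txt".split(LAST_EXT)  # => ["some.file.name", "txt"]
without_ext = "README".split(LAST_EXT)              # => ["README"]
dotfile     = ".bashrc".split(LAST_EXT)             # => ["", "bashrc"]

# Destructuring gives nil for the extension when there is none.
fileroot, type = with_ext
```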

-a


#5

On Jun 7, 2006, at 2:54, Austin Z. wrote:

On 6/6/06, darren kirby removed_email_address@domain.invalid wrote:

fileroot, type = filename.split(/regex here?/)

I cannot find a way to write a regex that only matches the last period in the
filename. Is there an elegant way to do this?

ext = File.extname(filename)
file = File.basename(filename, ext)

Equivalent, but I like the semantics a little better

require 'pathname'
f = Pathname.new(filename)
f.basename
f.extname

pathname is very handy - see the standard library docs:
http://ruby-doc.org/stdlib/libdoc/pathname/rdoc/index.html


#6

I was wondering though if instead of the temp variable I could just do:

fileroot, type = filename.split(/regex here?/)

I cannot find a way to write a regex that only matches the last period in the
filename. Is there an elegant way to do this?

Not sure if it can be considered elegant, or if it’ll work in all
situations, but:

fileroot, type = /^([^.]*$|.*(?=\.))\.?(.*)$/.match(filename)[1..2]

Seems to work for the tests I can think of:

fdasfa => ["fdasfa", ""]
.fdsafds => ["", "fdsafds"]
dda.dfasd => ["dda", "dfasd"]
fdsafd.fdsdaf.fdsaf => ["fdsafd.fdsdaf", "fdsaf"]
fdasfas. => ["fdasfas", ""]

Alternatively, the following might be easier to read:

fileroot, type = (/(.*)(\..*)/.match(filename)||[nil,filename])[1..2]
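
To make the test list above reproducible, here is a rough sketch that runs the first pattern over the same five names. It assumes the pattern reads as written out in full below; `SPLIT_RE` and `split_name` are names invented for this example.

```ruby
# Either the whole name (no periods), or everything up to the last period,
# then an optional period, then whatever is left as the extension.
SPLIT_RE = /^([^.]*$|.*(?=\.))\.?(.*)$/

# Hypothetical helper wrapping the match; returns [root, extension].
def split_name(filename)
  SPLIT_RE.match(filename)[1..2]
end

%w[fdasfa .fdsafds dda.dfasd fdsafd.fdsdaf.fdsaf fdasfas.].each do |name|
  puts "#{name} => #{split_name(name).inspect}"
end
```

The greedy `.*` inside the lookahead branch is what pushes the split to the last period rather than the first.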


#7

quoth the Austin Z.:

On 6/6/06, darren kirby removed_email_address@domain.invalid wrote:

fileroot, type = filename.split(/regex here?/)

I cannot find a way to write a regex that only matches the last period in
the filename. Is there an elegant way to do this?

ext = File.extname(filename)
file = File.basename(filename, ext)

-austin

Thanks. I should have guessed there would be a builtin for it…

-d


#8

On 6/6/06, Matthew S. removed_email_address@domain.invalid wrote:

Equivalent, but I like the semantics a little better

require 'pathname'
f = Pathname.new(filename)
f.basename
f.extname

pathname is very handy - see the standard library docs:
http://ruby-doc.org/stdlib/libdoc/pathname/rdoc/index.html

I think you mean f.basename(f.extname) for the basename without the
extension.
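
A short sketch of the difference, using an arbitrary example path. Pathname#basename returns another Pathname, so `to_s` is called here to get plain strings.

```ruby
require 'pathname'

f = Pathname.new("/tmp/some.file.name.txt")  # example path for illustration

ext  = f.extname               # => ".txt" (a String, period included)
root = f.basename(ext).to_s    # => "some.file.name"
full = f.basename.to_s         # => "some.file.name.txt" (extension still attached)
```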

-austin


#9

If you want to use a regex, try this:

str = "abc.def.ghi.jkl.mno"
root, ext = /^.*\./.match(str).to_s.chop, /^.*\./.match(str).post_match
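
A possible variation that runs the match only once and answers the original question directly: a lookahead makes the pattern match only a period that has no further periods after it. This is a sketch, not something from the posts above; `LAST_DOT` and `root_and_ext` are made-up names, and returning an empty string for a missing extension is an assumption.

```ruby
# A period followed only by non-period characters up to the end of the string.
LAST_DOT = /\.(?=[^.]*\z)/

# Hypothetical helper: one match, then take the text on either side of it.
def root_and_ext(str)
  m = LAST_DOT.match(str)
  m ? [m.pre_match, m.post_match] : [str, ""]
end

p root_and_ext("abc.def.ghi.jkl.mno")     # => ["abc.def.ghi.jkl", "mno"]
p root_and_ext("abc")                     # => ["abc", ""]

# The same pattern also works with split, as originally asked.
p "abc.def.ghi.jkl.mno".split(LAST_DOT)   # => ["abc.def.ghi.jkl", "mno"]
```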

Harry