Strip illegal characters from a file name

Hi,

I have a program which takes a user's details and saves them in a PDF.
The name of the PDF file should consist of the user's ID, an underscore
and the user's surname, e.g. 01_Smith.pdf

Thing is, how do I check for illegal file name characters in the
surname?
On Windows, for example, the characters \/:*?"<>| are not allowed in a
file name, so obviously they shouldn’t occur in the surname.
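
For reference, removing just the Windows-reserved characters could look something like the sketch below. The constant and method names are made up for illustration, and it only covers the punctuation listed above, not other platforms' rules.

```ruby
# Characters Windows rejects in file names: \ / : * ? " < > |
# (WINDOWS_ILLEGAL and strip_windows_illegal are hypothetical names.)
WINDOWS_ILLEGAL = /[\\\/:*?"<>|]/

# Return a copy of the name with every reserved character removed.
def strip_windows_illegal(name)
  name.gsub(WINDOWS_ILLEGAL, "")
end

puts strip_windows_illegal('Smith/Jones?')
```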

As there seem to be multiple combinations of illegal file name
characters on different operating systems, I thought about stripping
everything except letters from the surname and came up with:

surname.gsub(/[^a-zA-Z]/,"")

But as we have users from around the world, it is common for them to
have special characters in their surname (e.g. äöüß in Germany). The
above regexp strips these out too, which isn’t ideal.

Is there a regexp which would strip out everything but letters and
leave special characters intact, or am I going about this the wrong
way?
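
One possible direction, as a rough sketch: Ruby's regexps support Unicode property classes, so \p{L} ("any letter") can stand in for a-zA-Z. This assumes the surname is a UTF-8 string; letters_only and pdf_name are hypothetical helper names, not part of the program described above.

```ruby
# \p{L} matches any Unicode letter, so ä, ö, ü and ß are kept,
# while punctuation, digits and whitespace are removed.
def letters_only(surname)
  surname.gsub(/[^\p{L}]/, "")
end

# Build the file name in the id_surname.pdf form described above.
def pdf_name(id, surname)
  "#{id}_#{letters_only(surname)}.pdf"
end

puts pdf_name("01", "Müller/Groß")
```

Note that this also drops hyphens and apostrophes, so a double-barrelled name like Smith-Jones would be run together. Whether that is acceptable depends on the application.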

Any tips would be greatly appreciated.

I’ve got a script that strips out invalid characters from TextMate
bundles so that they can be used on Windows. It has worked in all
cases so far and is reversible because it uses URI::escape to encode
the characters. It might be helpful as a starting point for you.

A script to fix filenames that are incompatible with Windows. · GitHub
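
To illustrate the reversible idea without depending on the linked script: URI::escape was removed in Ruby 3.0, so the sketch below percent-encodes the problem characters by hand. It is not the script itself, and the names UNSAFE, encode_name and decode_name are invented for this example.

```ruby
# Percent-encode only the characters Windows rejects, plus "%" itself
# so that decoding stays unambiguous. (Hypothetical names throughout.)
UNSAFE = /[\\\/:*?"<>|%]/

# Replace each unsafe character with % followed by its two-digit hex code.
def encode_name(name)
  name.gsub(UNSAFE) { |ch| format("%%%02X", ch.ord) }
end

# Reverse encode_name: turn every %XX sequence back into its character.
def decode_name(encoded)
  encoded.gsub(/%(\h\h)/) { $1.hex.chr }
end

original = 'Müller: "Groß"?'
encoded  = encode_name(original)
puts encoded
puts decode_name(encoded) == original
```

Because nothing is thrown away, the original surname can be recovered from the file name later, which plain stripping does not allow.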

Charles

Hey, that’s really cool.
Thanks very much for pointing me in the right direction.