RegEx for punctuation?


#1

Hey all,
I have a quick question about RegEx, which I don’t use often and I
don’t have my reference book handy.
How do I gsub all non-alphanumeric characters to nothing?
I know it’s something like this:
some_string.gsub!('something here', '')

but I can’t for the life of me remember the RegEx for this.

thanks ahead of time

John J.


#2

John J. wrote:

Hey all,
I have a quick question about RegEx, which I don’t use often and I don’t
have my reference book handy.
How do I gsub all non-alphanumeric characters to nothing?
I know it’s something like this:
some_string.gsub!('something here', '')

irb(main):002:0> ",.'abc123".gsub(/[^[:alnum:]]/, '')
=> “abc123”
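For anyone reading this later, here is a minimal sketch of the same idea wrapped in a method. The name strip_non_alnum is made up for illustration and is not from the thread. It also shows one thing to watch for with the bang version: gsub! returns nil when nothing was replaced, so its return value should not be chained.

```ruby
# Sketch only: strip_non_alnum is a hypothetical helper name.
# Removes every character that is not a letter or a digit.
def strip_non_alnum(str)
  str.gsub(/[^[:alnum:]]/, '')
end

puts strip_non_alnum(",.'abc123")   # prints abc123

# gsub! changes the string in place, but returns nil if no
# substitution was made, so do not chain on its result.
s = "abc123".dup
result = s.gsub!(/[^[:alnum:]]/, '')
p result   # prints nil (nothing needed replacing)
p s        # prints "abc123"
```

If the result is going to be assigned or chained, the non-bang gsub is usually the safer choice.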


#3

On Aug 2, 2007, at 2:14 PM, Alex Y. wrote:


Alex

Thanks a million!
I’m going to have to do more RegEx stuff to get it buried into my brain!

One more question though, I previously had this to turn spaces into
underscores
some_string.gsub(' ', '_')

How can I do this while also removing all the other non-alphanumeric characters?

John J.


#4

On Aug 2, 2007, at 2:26 PM, John J. wrote:

irb(main):002:0> ",.'abc123".gsub(/[^[:alnum:]]/, '')
underscores
some_string.gsub(' ', '_')

How can I do this while also removing all the other non-alphanumeric characters?

John J.

Found it. In Peter C.'s book! (I knew I bought that ebook for a
reason)
this is my final version:

 @asset.permalink = @asset.name.downcase.gsub(' ', '_').gsub(/\W/, '')

Downcase everything, turn spaces into underscores, then take all
non-alpha/non-numeric/non-underscore characters out.
sweet and simple.
For anyone else looking for it in the archives,
\w matches all alpha/numeric/underscore characters
\W matches everything else.
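A quick sketch of the difference between the two classes, plus the permalink chain pulled out into a method. The method name permalink_for and the sample strings are assumptions for illustration only.

```ruby
sample = "hello_world 42!"

# \w+ picks out runs of letters, digits and underscores.
p sample.scan(/\w+/)      # prints ["hello_world", "42"]

# \W is the complement: here it matches the space and the "!".
p sample.gsub(/\W/, '')   # prints "hello_world42"

# Hypothetical helper: same chain as the permalink line above.
# Spaces become underscores first, so they survive the \W pass.
def permalink_for(name)
  name.downcase.gsub(' ', '_').gsub(/\W/, '')
end

p permalink_for("My First Post!")   # prints "my_first_post"
```

The order matters: if the \W substitution ran first, the spaces would be removed before they could be turned into underscores.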

I could go a little further in the case of something like

title = 'Untitled #234'

and do this:

title = title.downcase.gsub(' ', '_').gsub('#', 'number').gsub(/\W/, '')

resulting in:
untitled_number234

Nice and tidy. It could be refined a bit further to make sure a space is
prepended to ‘number’ if no space precedes ‘#’, and to make sure a
number actually follows ‘#’ before going to all the trouble.
If I wanted it hard to read, I could even try to squeeze it all into one
big RegEx, but the thing I’ve got above will suffice for now.
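One possible way to sketch that refinement, not a definitive version: the method name slugify is made up, and the lookahead (?=\d) is my assumption about how to check that a digit follows the '#'.

```ruby
# Sketch only: slugify is a hypothetical name.
# A '#' is rewritten to ' number' only when a digit follows it,
# and any whitespace before the '#' is collapsed to a single space.
def slugify(title)
  with_number = title.downcase.gsub(/\s*#(?=\d)/, ' number').strip
  with_number.gsub(' ', '_').gsub(/\W/, '')
end

p slugify('Untitled #234')   # prints "untitled_number234"
p slugify('Untitled#234')    # prints "untitled_number234"
p slugify('C# rocks')        # prints "c_rocks"
```

A '#' that is not followed by a digit is left alone by the first substitution and then dropped by the final \W pass, like any other punctuation.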

cheers,
John J.