Newbie Question: delete all non alphanumeric characters

Hi all,
How can I delete all non-alphanumeric characters in a string? Thanks.

On Jul 21, 2006, at 1:53 PM, Theallnighter T. wrote:

Hi all,
how can i delete all non alphanumeric characters in a string ? thanks


Posted via http://www.ruby-forum.com/.

string.gsub(/[0-9a-z]+/i, '')

Logan C. wrote:

string.gsub(/[0-9a-z]+/i, '')

That deletes all alphanumeric. To delete all non-alphanumeric:

string.gsub(/[^0-9a-z]/i, '')
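
To make the difference between the two patterns concrete, here is a minimal sketch. The method names keep_alnum and drop_alnum are made up for illustration and are not from the thread; the sample string is an assumption too.

```ruby
# Hypothetical helper names, used only for this illustration.

# Removes every character that is NOT an ASCII letter or digit.
# The ^ right after the opening [ negates the character class.
def keep_alnum(str)
  str.gsub(/[^0-9a-z]/i, '')
end

# Removes the letters and digits themselves (the earlier, inverted answer).
def drop_alnum(str)
  str.gsub(/[0-9a-z]+/i, '')
end

sample = 'Hello, World! 42'
puts keep_alnum(sample)          # prints HelloWorld42
puts drop_alnum(sample).inspect  # prints ", ! "
```

Both methods return a new string; the original is left untouched, as with any non-bang gsub.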

On Jul 21, 2006, at 2:05 PM, Tom W. wrote:

Tom W.
Helmets to Hardhats
Software Developer
www.helmetstohardhats.org

Doh! I’m obviously not awake yet this (err) afternoon.

On Jul 21, 2006, at 3:40 PM, Jim C. wrote:

output:

There are 2007 beans and 15234 grains of rice in this bag.
Thereare2007beansand15234grainsofriceinthisbag

Well the only “problem” with that is

x = '\w includes_under_scores_too'

I think \W means non-word in the Perl sense, and the underscore counts as a
word character, so underscores won’t be stripped. If you want those out too:

irb(main):006:0> str = "The $re34& __q!?"
=> "The $re34& __q!?"
irb(main):007:0> str.gsub(/\W/, '')
=> "There34__q"
irb(main):008:0> str.gsub(/\W|_/, '')
=> "There34q"
irb(main):009:0>

Jeff
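
As a small sketch of the same idea, the alternation \W|_ can also be written as a single character class, [\W_]. The method name strip_non_alnum below is hypothetical, chosen only for this example.

```ruby
# Hypothetical helper name. [\W_] matches any non-word character
# or an underscore, the same set of characters as \W|_.
def strip_non_alnum(str)
  str.gsub(/[\W_]/, '')
end

str = 'The $re34& __q!?'
puts str.gsub(/\W/, '')    # underscores survive: There34__q
puts strip_non_alnum(str)  # underscores removed: There34q
```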

On 2006-07-21, Theallnighter T. wrote:

Hi all,
how can i delete all non alphanumeric characters in a string ? thanks

I’ve also just started to learn Ruby, so I thought I’d reply for the
practice. Here’s one solution:


#!/usr/bin/ruby

x = "There are 2007 beans and 15234 grains of rice in this bag."
puts x
x.gsub!(/\W/, '')
puts x


output:

There are 2007 beans and 15234 grains of rice in this bag.
Thereare2007beansand15234grainsofriceinthisbag

On 2006-07-21, Logan C. wrote:

x = '\w includes_under_scores_too'

Woah! Thanks for pointing that out. It looks like
http://www.ruby-doc.org/docs/ruby-doc-bundle/UsersGuide/rg/regexp.html
has a bug:

\w letter or digit; same as [0-9A-Za-z]

It’s missing a _.

Here’s a fixed version:

#!/usr/bin/ruby

x = "There are 2007 beans_and 15234 grains of rice in this bag."
puts x
x.gsub!(/\W/, '')
puts x
x.gsub!(/\W|_/, '')
puts "fixed:"
puts x

[email protected] wrote:

for fun, I started irb, then typed

"567576hgjhgjh&**)".gsub(/^[0-9a-z]/i, '')

It returned

67576hgjhgjh&**)

The caret goes inside the brackets (it inverts the character class)

Tom

for fun, I started irb, then typed

"567576hgjhgjh&**)".gsub(/^[0-9a-z]/i, '')

It returned

67576hgjhgjh&**)

No wonder. With the ^ outside the brackets, the pattern only matches at the
beginning of the string, and there was only one character there…

Regards,
Rimantas
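
A rough sketch of the two meanings of ^, using the same string as above. The method names anchored_strip and negated_strip are invented for this illustration.

```ruby
# Hypothetical helper names, used only for this illustration.

# ^ outside the brackets is an anchor: it matches at the start of a line,
# so at most the first character of each line can be removed.
def anchored_strip(str)
  str.gsub(/^[0-9a-z]/i, '')
end

# ^ as the first character inside the brackets negates the class,
# so every character that is not a letter or digit is removed.
def negated_strip(str)
  str.gsub(/[^0-9a-z]/i, '')
end

s = '567576hgjhgjh&**)'
puts anchored_strip(s)  # prints 67576hgjhgjh&**)
puts negated_strip(s)   # prints 567576hgjhgjh
```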

On 21-Jul-06, at 4:19 PM, Tom W. wrote:

The caret goes inside the brackets (it inverts the character class)

And it should look like this:

"567576hgjhgjh&**)".sub(/[^0-9a-zA-Z]+/i, '')

Note the +

Tom


Jeremy T.

“One serious obstacle to the adoption of good programming languages
is the notion that everything has to be sacrificed for speed. In
computer languages as in life, speed kills.” – Mike Vanier

On 2006-07-21, Jim C. wrote:

#!/usr/bin/ruby

puts "fixed:"
puts x

Oops - the above has a bug (although it still “works”). Here’s a fixed
version, with an opposite example further demonstrating the bug in the
ruby doc site:

#!/usr/bin/ruby

s = "There are 2007 beans_and 15234 grains of rice in this bag."
x = s.dup
y = s.dup
puts "original:"
puts x
x.gsub!(/\W/, '')
puts "\nbroken:"
puts x
y.gsub!(/\W|_/, '')
puts "\nfixed:"
puts y

puts "\nopposite:"
z = s.dup
z.gsub!(/\w/, '')
puts z

original:
There are 2007 beans_and 15234 grains of rice in this bag.

broken:
Thereare2007beans_and15234grainsofriceinthisbag

fixed:
Thereare2007beansand15234grainsofriceinthisbag

opposite:
.

Jeremy T. wrote:

And it should look like this:

"567576hgjhgjh&**)".sub(/[^0-9a-zA-Z]+/i, '')

Note the +

#sub only does one replacement; adding a + will replace one chunk of
non-alphas, but not any others in the string.

Tom
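
To illustrate the point about sub versus gsub, here is a minimal sketch with a string that has two separate runs of non-alphanumerics. The method names first_run_only and every_run are made up for the example, as is the sample string.

```ruby
# Hypothetical helper names, used only for this illustration.

# sub replaces only the first match, so with + it removes
# just the first run of non-alphanumeric characters.
def first_run_only(str)
  str.sub(/[^0-9a-z]+/i, '')
end

# gsub replaces every match, so all runs are removed.
def every_run(str)
  str.gsub(/[^0-9a-z]+/i, '')
end

s = 'a--b!!c'
puts first_run_only(s)  # prints ab!!c
puts every_run(s)       # prints abc
```

With gsub the + is optional for this job: it changes how many replacements are made, not the resulting string.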

On 21-Jul-06, at 4:44 PM, Tom W. wrote:

of non-alphas, but not any others in the string.
typo, sorry.

Tom



On 7/21/06, Theallnighter T. wrote:

Hi all,
how can i delete all non alphanumeric characters in a string ? thanks



TMTOWTDI:

username.delete('^A-Za-z0-9')

…I just thought I’d add a little variety to this collection of
Regexp-centric solutions.
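
A quick sketch of how the String#delete version lines up with the regexp one. The method name alnum_only and the sample username are assumptions made for the example.

```ruby
# Hypothetical helper name. String#delete takes a character set, not a
# regexp; a leading ^ negates the set, much like ^ inside [...] in a regexp.
def alnum_only(str)
  str.delete('^A-Za-z0-9')
end

username = 'j.doe_42!'
puts alnum_only(username)                                       # prints jdoe42
puts alnum_only(username) == username.gsub(/[^A-Za-z0-9]/, '')  # prints true
```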

On Jul 21, 2006, at 6:15 PM, Jeremy T. wrote:

#sub only does one replacement; adding a + will replace one chunk
of non-alphas, but not any others in the string.

typo, sorry.

Speaking of typos: say either a-zA-Z or a-z with /i; you don’t need both.
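
As a small check of that point, the two spellings below give the same result on an ASCII string. The constant names WITH_FLAG and WITH_RANGES and the sample string are made up for the illustration.

```ruby
# Hypothetical constant names, used only for this illustration.
WITH_FLAG   = /[^0-9a-z]/i     # lower-case range plus the case-insensitive flag
WITH_RANGES = /[^0-9a-zA-Z]/   # both ranges spelled out, no flag

s = 'Mixed CASE, 123!'
puts s.gsub(WITH_FLAG, '')    # prints MixedCASE123
puts s.gsub(WITH_RANGES, '')  # prints MixedCASE123
```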