Finding non-printable characters using Regular Expressions

As part of a method I am playing with while learning Ruby I need to be
able to determine which characters in a string are non-printable. What
is the “best” method for determining if a character is printable, such
as an “A”, or unprintable, such as a tab?
While I could create a list of printable characters using ranges, is that
the best way to do it?

Michael W. Ryder wrote:

As part of a method I am playing with while learning Ruby I need to be
able to determine which characters in a string are non-printable. What
is the “best” method for determining if a character is printable, such
as an “A”, or unprintable, such as a tab?
While I could create a list of printable characters using ranges, is that
the best way to do it?

The POSIX character classes are for exactly this:

irb(main):001:0> "A \n B \t C".gsub(/[[:graph:]]/, '')
=> " \n  \t "
irb(main):002:0> "A \n B \t C".gsub(/[[:print:]]/, '')
=> "\n\t"
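To answer the original question directly (which characters in a string are non-printable), the same character class can be negated and passed to String#scan. This is only a minimal sketch: the helper names non_printable_chars and printable? are made up for illustration and do not come from the thread, and String#match? assumes Ruby 2.4 or later.

```ruby
# Hypothetical helpers for illustration; the names are not from the thread.

# Returns every character in text that is NOT in the POSIX [:print:] class.
def non_printable_chars(text)
  text.scan(/[^[:print:]]/)
end

# True when the single character char is printable (space counts as printable).
def printable?(char)
  char.match?(/\A[[:print:]]\z/)
end

sample = "A \n B \t C"
p non_printable_chars(sample)  # => ["\n", "\t"]
p printable?("A")              # => true
p printable?("\t")             # => false
```

Using scan rather than gsub keeps the original string intact and returns the offending characters themselves, which is handy when the goal is to report them rather than strip them.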

Michael W. Ryder wrote:

Thank you for your assistance, it has given me a starting point and I
will have to spend some time experimenting and researching to reach the
final step.

You’re nearly there. Look a little closer at my suggestion,
particularly the second regex.

Alex Y. wrote:

irb(main):001:0> "A \n B \t C".gsub(/[[:graph:]]/, '')
=> " \n  \t "
irb(main):002:0> "A \n B \t C".gsub(/[[:print:]]/, '')
=> "\n\t"

This is very close to what I am looking for. If I use
"A \n B \t C".gsub(/[^[:graph:]]/, '')
it returns “ABC”, but I need to keep the spaces and have not been able
to figure out how to include them in the output so that it shows “A B
C”.
Thank you for your assistance, it has given me a starting point and I
will have to spend some time experimenting and researching to reach the
final step.
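The reason the spaces disappear is that [:graph:] covers only visible characters, so a space is not part of it, while [:print:] is [:graph:] plus the space. The sketch below compares the two negated classes on the string from this thread; the variable names are made up for illustration, and String#match? assumes Ruby 2.4 or later.

```ruby
sample = "A \n B \t C"

# A space is printable but not "graphical" (it leaves no visible mark).
p " ".match?(/[[:graph:]]/)  # => false
p " ".match?(/[[:print:]]/)  # => true

# Negating [:graph:] therefore strips the spaces along with \n and \t.
graph_only = sample.gsub(/[^[:graph:]]/, "")
p graph_only                 # => "ABC"

# Negating [:print:] strips only \n and \t. The space on either side of each
# removed character stays, so the result has two spaces between the letters.
print_only = sample.gsub(/[^[:print:]]/, "")
p print_only                 # => "A  B  C"

# Optional: collapse runs of spaces if single spacing is wanted.
collapsed = print_only.squeeze(" ")
p collapsed                  # => "A B C"
```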

Michael W. Ryder wrote:

"A \n B \t C".gsub(/[^[:graph:]]/, '')

I need to keep the spaces and have not been able to figure
out how to include them in the output so that it shows “A B C”.

Hint: examine the second parameter of String#gsub
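One possible reading of this hint, sketched here as an assumption rather than as what the poster had in mind: the second argument to String#gsub is the replacement, so it does not have to be an empty string, and gsub also accepts a block in its place. The variable names below are made up for illustration.

```ruby
sample = "A \n B \t C"

# Replace each non-printable character with a visible placeholder
# instead of deleting it.
marked = sample.gsub(/[^[:print:]]/, "?")
p marked    # => "A ? B ? C"

# With a block, the replacement is computed per match. String#dump returns
# the character as a quoted, escaped literal, and [1..-2] drops the quotes.
escaped = sample.gsub(/[^[:print:]]/) { |ch| ch.dump[1..-2] }
puts escaped  # prints: A \n B \t C
```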

THE book on regex is “Mastering Regular Expressions” from O’Reilly.
Its examples are a bit Perl-focused, but the book itself is all
about regular expressions in use.

John J. wrote:

THE book on regex is “Mastering Regular Expressions” from O’Reilly.
Its examples are a bit Perl-focused, but the book itself is all
about regular expressions in use.

I will get a copy of the book, as trying to find the information on the
web is very time-consuming and hit-or-miss. Thank you for the
suggestion.

Alex Y. wrote:

The POSIX character classes are for exactly this:

[...] to figure out how to include them in the output so that it shows “A B C”.
Thank you for your assistance, it has given me a starting point and I
will have to spend some time experimenting and researching to reach
the final step.

You’re nearly there. Look a little closer at my suggestion,
particularly the second regex.

Thank you very much for your assistance. Using
"A \n B \t C".gsub(/[^[:print:]]/, '')
gives me “A B C”, which is what I was looking for.
Can you recommend a good reference on regular expressions so I can learn
more?