Is there a built-in method for identifying character types a la
ctype() in C? I would like to avoid requiring a gem dependency.
None other than regexp, as far as I know.
def alnum?(c); ctype(c) == :alnum end
def alpha?(c); ctype(c) == :alpha end
etc…
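One way such predicates could be built on regexps alone is sketched below. The module name CType and its list of classes are assumptions for illustration, not code from this thread; it relies only on Ruby's POSIX bracket expressions such as [[:alnum:]] and on Regexp#match? (Ruby 2.4 or later).

```ruby
# Hypothetical module: one predicate per POSIX character class.
module CType
  CLASSES = %i[alnum alpha digit space punct upper lower xdigit].freeze

  CLASSES.each do |name|
    # \A and \z make the pattern match exactly one character and nothing else.
    re = /\A[[:#{name}:]]\z/
    define_singleton_method("#{name}?") { |c| re.match?(c) }
  end
end

CType.alpha?("x")  # => true
CType.digit?("x")  # => false
CType.alnum?("ab") # => false, because only single characters qualify
```

Separate predicates are used here rather than a single ctype(c) returning one symbol, because the classes overlap: a letter is both :alpha and :alnum.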
What do you need that for? Why not directly use a regexp to match a
string? Often you can use capturing groups in a single regexp, e.g.
irb(main):009:0> %w{foo bar 123}.each do |s|
irb(main):010:1*   if /(\d+)|(\w+)/ =~ s
irb(main):011:2>     puts "number" if $1
irb(main):012:2>     puts "chars" if $2
irb(main):013:2>   end
irb(main):014:1> end
chars
chars
number
=> ["foo", "bar", "123"]
That’s very cool. I’ll have to remember that one.
I do actually need to test if a character is of a certain type, all by
itself, as part of a parser I’m working on. I came up with this quick
solution for now: Character type identification module · GitHub
It is a bit brittle, though. Watch:
%w{foo bar 123 345bar}.each do |s|
  if /(\d+)|(\w+)/ =~ s
    puts "number" if $1
    puts "chars" if $2
  end
end
#=>
chars
chars
number
number
Your character-type module is similarly fragile. If you are only testing
a single character, it looks fine; if you are going to test whole
strings, it will not be enough.
Also, the whole solution is not necessarily fast. I do love regular
expressions, but I would keep looking for other methods to meet your
requirements. It really depends on what your goal is. If it is about
bit streams, you might want to look into bit-struct or similar
approaches that use Array#pack and String#unpack.
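As a rough illustration of the byte-level route, the sketch below unpacks a string into integers and classifies each byte by range, with no regexp involved. The helper ascii_ctype is hypothetical and handles ASCII only; it is not taken from bit-struct or from the thread.

```ruby
# String#unpack with "C*" yields the bytes as unsigned 8-bit integers.
bytes = "a1 ".unpack("C*")   # => [97, 49, 32]

# Hypothetical ASCII-only classifier working on a single byte value.
def ascii_ctype(byte)
  case byte
  when 0x30..0x39 then :digit
  when 0x41..0x5A, 0x61..0x7A then :alpha
  when 0x09..0x0D, 0x20 then :space
  else :other
  end
end

types = bytes.map { |b| ascii_ctype(b) }  # => [:alpha, :digit, :space]

# Array#pack reverses the conversion.
packed = bytes.pack("C*")                 # => "a1 "
```

Whether this is actually faster than a regexp would have to be measured for the input at hand.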
Yes, of course. The regexp was a quick hack only to demonstrate the
mechanism. You can easily fix that by properly anchoring the regexp.
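A minimal sketch of what the anchored version might look like, keeping the same capturing-group mechanism. The method name kind is made up for illustration; \A and \z tie the match to the start and end of the whole string.

```ruby
# Hypothetical helper: classifies a whole string, or returns nil if neither
# alternative matches the entire input.
def kind(s)
  return nil unless /\A(?:(\d+)|(\w+))\z/ =~ s
  $1 ? :number : :chars
end

%w{foo bar 123 345bar}.map { |s| kind(s) }
# => [:chars, :chars, :number, :chars]
```

With the anchors in place, "345bar" can no longer be reported as a number on the strength of its leading digits.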
If I have to implement a parser manually I would typically use regexp
for scanning. You can even make this fairly readable by using /x.
input.scan %r{
  # white space
  (\s+)
  # integer
  | ([-+]?\d+)
  # keyword: if
  | (if)
  # etc
}x do |m|
  # …
end
Of course, if the number of tokens is large this will become very
awkward. Better use a proper parser generator then.
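For a small token set, a runnable version of this scanning idea could look like the sketch below. The TOKEN constant, the tokenize method, the identifier rule, and the token tuple format are all assumptions added for illustration.

```ruby
# Each alternative has its own capturing group; /x allows layout and comments.
TOKEN = %r{
    (\s+)          # white space
  | ([-+]?\d+)     # integer
  | (if)\b         # keyword: if
  | (\w+)          # identifier, an assumed extra rule
}x

# Hypothetical tokenizer: returns [type, value] pairs and drops white space.
def tokenize(input)
  tokens = []
  input.scan(TOKEN) do |ws, int, kw, ident|
    next if ws
    tokens << [:int, int.to_i] if int
    tokens << [:keyword, kw]   if kw
    tokens << [:ident, ident]  if ident
  end
  tokens
end

tokenize("if x 42 -7")
# => [[:keyword, "if"], [:ident, "x"], [:int, 42], [:int, -7]]
```

Note that String#scan silently skips any input that matches none of the alternatives, so a real parser would need an explicit error rule.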
I only need to test individual characters. Extending the code to
match entire strings should be easy:
/\A[[:alnum:]]+\z/
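One detail worth checking when matching entire strings: in Ruby, ^ and $ anchor at line boundaries, while \A and \z anchor at the start and end of the string. The short comparison below uses made-up variable names to show the difference on multi-line input.

```ruby
line_anchored   = /^[[:alnum:]]+$/
string_anchored = /\A[[:alnum:]]+\z/

multi = "abc\n!!!"

line_ok   = line_anchored.match?(multi)       # => true, the first line alone satisfies it
string_ok = string_anchored.match?(multi)     # => false
plain_ok  = string_anchored.match?("abc123")  # => true
```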
I agree that a regexp-based solution is not necessarily fast, and I wish this were built in. Interesting suggestions. Thanks.
Regards,
Ammar