Doing an AND in regexp char class

This question arises out of a couple of recent threads and may or may
not be a Ruby-specific question.

I can check with a character class if one of the characters in the
class exists or does not exist, but can I use a regexp to check if a
string absolutely contains all of the characters in the class?

Using a set perspective, I can do it like this in irb…

s1 = "hello there"
s2 = "ohi"
(s2.unpack('c*') & s1.unpack('c*')).size == s2.size

=> false

I use unpack to avoid creating a bunch of String objects, one for each
element in the array, which would happen if I used #split. What I’m
wondering is if there is a way to do this with a simple regexp.
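
For anyone reading along, here is a minimal sketch of the same set-style check wrapped in a method. The name `contains_all?` is a placeholder of my own, not something from the thread, and it assumes Ruby 2.0 or later, where `String#chars` returns an Array. It deduplicates the wanted characters first, because the plain size comparison reports false for a needle with a repeated character such as "oho".

```ruby
# Hypothetical helper (the name is an assumption, not from the thread).
# True when every distinct character of `needle` occurs in `haystack`.
def contains_all?(haystack, needle)
  wanted = needle.chars.uniq
  (wanted & haystack.chars).size == wanted.size
end

p contains_all?("hello there", "ohi")
p contains_all?("hello there", "ohe")
p contains_all?("hello there", "oho")
```

This does allocate one small String per character, which is the cost the unpack version was trying to avoid.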

Thanks,
Todd

On May 8, 2008, at 3:40 PM, Todd B. wrote:

s1 = "hello there"
s2 = "ohi"
(s2.unpack('c*') & s1.unpack('c*')).size == s2.size

=> false

I use unpack to avoid creating a bunch of String objects, one for each
element in the array, which would happen if I used #split. What I’m
wondering is if there is a way to do this with a simple regexp.

Thanks,
Todd

cfp:~ > cat a.rb
class String
  def all_chars? chars
    tr(chars, '').empty?
  end
end

p 'foobar'.all_chars?('rabof')
p 'foobar'.all_chars?('abc')
p 'foobar'.all_chars?('')

cfp:~ > ruby a.rb
true
false
false
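
A quick sketch of why this works, assuming I read `String#tr` correctly: with an empty replacement string it deletes the listed characters, the same as `String#delete`, so the method is asking whether anything of the receiver is left over. The variable name `s` is only for illustration.

```ruby
s = 'foobar'
p s.tr('rabof', '')   # nothing survives, so all_chars? is true
p s.delete('rabof')   # same result via String#delete
p s.tr('abc', '')     # some characters survive, so all_chars? is false
```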

a @ http://codeforpeople.com/

On Thu, May 8, 2008 at 6:07 PM, ara.t.howard [email protected]
wrote:

Using a set perspective, I can do it like this in irb…

p 'foobar'.all_chars?('rabof')
p 'foobar'.all_chars?('abc')
p 'foobar'.all_chars?('')

cfp:~ > ruby a.rb
true
false
false

Cool :) #tr is one of those useful methods I somehow consistently
forget about.

tkx fur realizashuns,
Todd

On May 8, 2008, at 5:30 PM, Todd B. wrote:

tr is one of those useful methods I somehow consistently forget about

me too. just got lucky this time ;)

a @ http://codeforpeople.com/

On Thu, May 8, 2008 at 5:40 PM, Todd B. [email protected] wrote:

s1 = "hello there"
s2 = "ohi"
(s2.unpack('c*') & s1.unpack('c*')).size == s2.size

=> false

I use unpack to avoid creating a bunch of String objects, one for each
element in the array, which would happen if I used #split. What I’m
wondering is if there is a way to do this with a simple regexp.

REs can do this, but they may not be the best way. The way that comes to
mind is to see if the string matches the characters in any order,
i.e. for “ohi” any of ohi, oih, hio, hoi, iho, or ioh,
so something like

/o([^h]*h[^i]*i|[^i]*i[^h]*h)|h([^i]*i[^o]*o|[^o]*o[^i]*i)|i([^h]*h[^o]*o|[^o]*o[^h]*h)/

meaning

o followed by either
  zero or more non-h’s followed by an h followed by zero or more
  non-i’s followed by an i
or
  zero or more non-i’s followed by an i followed by zero or more
  non-h’s followed by an h
or
h followed by either (and so on for the remaining orderings)

It would be possible to generate such an RE from the string.

But maybe someone cleverer with REs has a better approach.
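
Generating the RE from the string could look roughly like the sketch below. `permutation_regexp` is a made-up name, and the code assumes `Array#permutation` is available and that at least one character is given. It writes one branch per ordering of the distinct characters, so the pattern grows factorially and is only practical for a handful of characters.

```ruby
# Hypothetical helper (not from the thread): builds an alternation with
# one branch per ordering of the distinct characters in `chars`.
def permutation_regexp(chars)
  raise ArgumentError, 'need at least one character' if chars.empty?
  branches = chars.chars.uniq.permutation.map do |order|
    first, *rest = order.map { |c| Regexp.escape(c) }
    # After the first character, skip anything that is not the next
    # wanted character, then require that character.
    first + rest.map { |c| "[^#{c}]*#{c}" }.join
  end
  Regexp.new(branches.join('|'))
end

re = permutation_regexp('ohi')
p !!(re =~ 'oh, hi there')
p !!(re =~ 'hello there')
```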


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

On Thu, May 8, 2008 at 7:00 PM, Joel VanderWerf
[email protected] wrote:

But it can be done with regex, right? It’s just more elegant with tr.

class String
  def all_chars? chars
    if chars.empty?
      empty?
    else
      /\A[#{chars}]*\z/ === self
    end
  end
end

p 'foobar'.all_chars?('rabof') # => true
p 'foobar'.all_chars?('abc') # => false
p 'foobar'.all_chars?('') # => false

I’m drawing a blank here with this one. Why doesn’t this work then…

irb(main):006:0> r = /\A[oh]*\z/
=> /\A[oh]*\z/
irb(main):007:0> s = "hello, there"
=> "hello, there"
irb(main):008:0> r === s
=> false

Todd

Todd B. wrote:

Cool :) #tr is one of those useful methods I somehow consistently forget about.

But it can be done with regex, right? It’s just more elegant with tr.

class String
  def all_chars? chars
    if chars.empty?
      empty?
    else
      /\A[#{chars}]*\z/ === self
    end
  end
end

p 'foobar'.all_chars?('rabof') # => true
p 'foobar'.all_chars?('abc') # => false
p 'foobar'.all_chars?('') # => false

Todd B. wrote:

I’m drawing a blank here with this one. Why doesn’t this work then…

irb(main):006:0> r = /\A[oh]*\z/
=> /\A[oh]*\z/
irb(main):007:0> s = "hello, there"
=> "hello, there"
irb(main):008:0> r === s
=> false

Maybe I’m confused about what was wanted originally. The above tests the
following condition:

(set of chars occurring in given string)
is_a_subset_of
(given set of chars).

irb(main):007:0> /\A[oh]*\z/ === "hohoho"
=> true
irb(main):008:0> /\A[oh]*\z/ === "ho ho"
=> false

If you want superset instead of subset, this works:

irb(main):013:0> /(?=.*h)(?=.*o)/ === "h o"
=> true

Hi –

On Fri, 9 May 2008, Joel VanderWerf wrote:

Maybe I’m confused about what was wanted originally. The above tests the following

If you want superset instead of subset, this works:

irb(main):013:0> /(?=.*h)(?=.*o)/ === "h o"
=> true

That depends on the order, though. To do the superset test, you could
just do the subset, but in the other direction: check that the
character class, as a string, doesn’t contain anything that isn’t in
the main string:

str = "h o"
chars = "ho"

/\A[#{str}]*\z/ === chars # true

(though probably best to uniquify the string first).
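
A sketch of the reversed test with the string uniquified and escaped before it goes into the character class. The method name `all_present_in?` and its argument order are assumptions for illustration. Escaping matters because characters such as -, ^ and ] have special meaning inside a class.

```ruby
# Hypothetical helper: true when every character of `chars` occurs in `str`.
def all_present_in?(chars, str)
  # An empty character class is a syntax error, so handle empty str first.
  return chars.empty? if str.empty?
  klass = str.chars.uniq.map { |c| Regexp.escape(c) }.join
  /\A[#{klass}]*\z/ === chars
end

p all_present_in?("ho", "h o")
p all_present_in?("ohi", "hello there")
p all_present_in?("b", "a-c")   # "-" is taken literally, not as a range
```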

David

Hi –

On Fri, 9 May 2008, Todd B. wrote:


I’m drawing a blank here with this one. Why doesn’t this work then…

irb(main):006:0> r = /\A[oh]*\z/
=> /\A[oh]*\z/
irb(main):007:0> s = "hello, there"
=> "hello, there"
irb(main):008:0> r === s
=> false

“hello, there” contains letters other than o and h, but your regex
calls for a string consisting of zero or more o’s or h’s and nothing
else.

I think there might be some confusion between determining that a
string contains certain characters, and determining that a string
contains only certain characters. My understanding was that you
wanted the first, which you could do with tr, but I think you’d
probably want the character cluster to be doing the tr’ing:

"oh".tr("hello, there","").empty? # true; all letters in "oh"
                                    # are also in "hello, there"
"hello, there".tr("ho","").empty? # false

They’re both strings, of course, so you can do either with Ara’s
or Joel’s methods:

"oh".all_chars?("hello, there") # true
"hello, there".all_chars?("oh") # false

though if it’s really the former you want you might want to name it
all_present_in? or something.

David

Hi –

On Fri, 9 May 2008, Joel VanderWerf wrote:

irb(main):004:0> /(?=.*h)(?=.*o)/m === "o \nh"
=> true

Does that fix the order problem you were thinking of?

Actually I think I was wrong about the order mattering (since they’re
zero-width). But /m helps anyway. I still think you could just change
the roles of the two strings and dissect “the string” as a character
class and “the characters” as a string, and use your original
technique.

David

David A. Black wrote:

irb(main):013:0> /(?=.*h)(?=.*o)/ === "h o"
=> true

That depends on the order, though.

Yes, it’s buggy. Should use //m:

irb(main):003:0> /(?=.*h)(?=.*o)/ === "o \nh"
=> false
irb(main):004:0> /(?=.*h)(?=.*o)/m === "o \nh"
=> true

Does that fix the order problem you were thinking of?
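
The lookahead version can also be built from a string of wanted characters. This is only a sketch: `lookahead_regexp` is a hypothetical name, and anchoring with \A is my own addition so that a failed match is not retried at every later position.

```ruby
# Hypothetical helper: one zero-width lookahead per distinct character.
def lookahead_regexp(chars)
  lookaheads = chars.chars.uniq.map { |c| "(?=.*#{Regexp.escape(c)})" }
  # MULTILINE makes "." match newlines as well, as with the /m flag.
  Regexp.new('\A' + lookaheads.join, Regexp::MULTILINE)
end

re = lookahead_regexp('ho')
p re === "o \nh"
p re === "h o"
p re === "hi"
```

Since each lookahead is zero-width and starts from the same position, the order of the characters in the tested string does not matter.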

Joel VanderWerf wrote:

But it can be done with regex, right? It’s just more elegant with tr.

class String
  def all_chars? chars
    if chars.empty?
      empty?
    else
      /\A[#{chars}]*\z/ === self
    end
  end
end

p 'foobar'.all_chars?('rabof') # => true
p 'foobar'.all_chars?('abc') # => false
p 'foobar'.all_chars?('') # => false

Your method doesn’t work, which can clearly be seen in these examples:

strs = ["aaa", "bbb", "ccc"]
chars = "abc"

strs.each do |str|
  if /\A[#{chars}]*/ =~ str
    print str, " - yes"
    puts
  else
    print str, " - no"
    puts
  end
end

--output:--
aaa - yes
bbb - yes
ccc - yes

It should be clear from the output that even though the string “aaa”
passes your test, it is not true that all the characters in the string
“abc” appear in the string “aaa”.

On Thu, May 8, 2008 at 7:26 PM, Joel VanderWerf
[email protected] wrote:

Maybe I’m confused about what was wanted originally. The above tests the
following condition:

(set of chars occurring in given string)
is_a_subset_of
(given set of chars).

Yep. The subject title is misleading, because the AND is already
there: [^ho] means not h and also not o.

I was looking to find if given a string A, can I say whether or not
all of the characters in string A exist in string B (count doesn’t
matter, just existence). All of you gave me some good answers that I
hadn’t thought of. Good brain food :)

Todd

Hi –

On Fri, 9 May 2008, 7stud – wrote:

It should be clear from the output that even though the string “aaa”
passes your test, it is not true that all the characters in the string
“abc” appear in the string “aaa”.

Do it the other way around (and don’t forget the \z):

if /\A[#{str}]*\z/ =~ chars

It’s really the characters in str that you’re testing, to make sure
that none of them fail to match the characters in chars. If the
variable names seem backwards, you can change them. It’s the logic
that’s important, and it works fine.

David

David A. Black wrote:

Hi –

On Fri, 9 May 2008, 7stud – wrote:

It should be clear from the output that even though the string “aaa”
passes your test, it is not true that all the characters in the string
“abc” appear in the string “aaa”.

Do it the other way around (and don’t forget the \z):

Whoops.

if /\A[#{str}]*\z/ =~ chars

It’s really the characters in str that you’re testing, to make sure
that none of them fail to match the characters in chars. If the
variable names seem backwards, you can change them. It’s the logic
that’s important, and it works fine.

Nice.

2008/5/9 ara.t.howard [email protected]:

class String
  def all_chars? chars
    tr(chars, '').empty?
  end
end

Using String#tr is nice, but the result is not what Todd wants:

s1 = "hello there"
s2 = "ohe"

(s2.unpack('c*') & s1.unpack('c*')).size == s2.size
=> true

class String
  def all_chars? chars
    tr(chars, '').empty?
  end
end

s1.all_chars?(s2)
=> false

Like in the regexp examples, you have to switch self and chars:

class String
  def all_chars? chars
    chars.tr(self, '').empty?
  end
end

s1.all_chars?(s2)
=> true
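
One caveat with handing an arbitrary string to tr: it reads a-c as a range and a leading ^ as negation, so strings containing those characters can give surprising answers. A sketch that compares characters literally instead (the name `all_present?` is hypothetical, not from the thread):

```ruby
# Hypothetical helper: true when every character of `needle` occurs
# literally in `haystack`; no range or negation syntax is interpreted.
def all_present?(haystack, needle)
  needle.each_char.all? { |c| haystack.include?(c) }
end

p "b".tr("a-c", '').empty?          # tr reads "a-c" as the range a..c
p all_present?("a-c", "b")          # literal comparison
p all_present?("hello there", "ohe")
```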

Regards,
Pit

On Fri, May 9, 2008 at 1:50 AM, ara.t.howard [email protected]
wrote:

me too. just got lucky this time ;)
Knowledge → the art of getting lucky very often, right Ara ;)

we can deny everything, except that we have the possibility of being better.
simply reflect on that.
h.h. the 14th dalai lama
BTW when I was referring to the quote I learnt most about I was
thinking about “Be kind whenever it is possible. It is always
possible”.

Not that I dislike the others or apply any judgment I just wanted to
be clear that I personally learnt the most from the above :)

Cheers
Robert


http://ruby-smalltalk.blogspot.com/


Whereof one cannot speak, thereof one must be silent.
Ludwig Wittgenstein