Using grep on subarrays - help!

dubstep · April 3, 2011, 3:32pm

Can anyone help with this? I thought grep would find any element that
matches in an array. It seems not…

irb(main):028:0> test = [['one', 'vol1'], ['one', 'vol2'], ['two', 'vol3']]
=> [["one", "vol1"], ["one", "vol2"], ["two", "vol3"]]
irb(main):029:0> test.grep(/one/)
=> []
irb(main):030:0> test.each.grep(/one/)
=> []
irb(main):031:0> test
=> [["one", "vol1"], ["one", "vol2"], ["two", "vol3"]]
irb(main):032:0> test.grep('one')
=> []
irb(main):033:0> test2 = ['one', 'one', 'two', 'three']
=> ["one", "one", "two", "three"]
irb(main):034:0> test2.grep(/one/)
=> ["one", "one"]
irb(main):035:0>

simonh · April 3, 2011, 3:48pm

On Sun, Apr 3, 2011 at 3:32 PM, Simon H. wrote:

Can anyone help with this? I thought grep would find any element that
matches in an array. It seems not…

irb(main):028:0> test = [['one', 'vol1'], ['one', 'vol2'], ['two', 'vol3']]
=> [["one", "vol1"], ["one", "vol2"], ["two", "vol3"]]
irb(main):029:0> test.grep(/one/)
=> []

test.each {|x| puts x.grep(/one/)}

=> one
one

If you want to choose all pairs that match:

result = []
test.each {|x| result << x unless x.grep(/one/).empty?}
result

=> [["one", "vol1"], ["one", "vol2"]]

Jesus.

simonh · April 3, 2011, 3:58pm

Thanks again Jesus. Is it just me, or does that seem a bit long-winded?
Any idea why one can’t just directly search an array with subarrays?

simonh · April 3, 2011, 11:13pm

Simon H. wrote in post #990675:

Can anyone help with this? I thought grep would find any element that
matches in an array. It seems not…

You are trying to use grep() to match strings, yet your test array does
not contain strings; it contains subarrays. So first you have to grab
each subarray, and then you can apply grep to it, because the subarrays
are what contain the strings.
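
To make the reason concrete: Enumerable#grep keeps the elements for which pattern === element is true, and a Regexp never matches an Array. A minimal sketch of that check, where the names outer_check, inner_check, outer and inner are introduced here only for illustration:

```ruby
test = [['one', 'vol1'], ['one', 'vol2'], ['two', 'vol3']]

# grep keeps the elements for which `pattern === element` is true.
# Each top-level element is an Array, and Regexp#=== is false for an Array.
outer_check = (/one/ === test.first)        # pattern against a subarray
inner_check = (/one/ === test.first.first)  # pattern against a string

outer = test.grep(/one/)                    # nothing matches at the top level
inner = test.map { |sub| sub.grep(/one/) }  # grep applied to each subarray

p outer_check, inner_check, outer, inner
```

So grep is doing what it always does; it is simply being asked to compare a regex against arrays rather than strings.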

But, there is no reason to use grep() if you are looking for strings:

test = [
  ['one', 'vol1'],
  ['one', 'vol2'],
  ['two', 'vol3']
]

target = "one"

new_arr = test.select do |arr|
  if arr.include?(target)
    true
  end
end

p new_arr

--output:--
[["one", "vol1"], ["one", "vol2"]]

simonh · April 3, 2011, 4:44pm

2011/4/3 Jesús Gabriel y Galán

result = []
test.each {|x| result << x unless x.grep(/one/).empty?}
result

More to-the-point:

test.reject { |x| x.grep(/one/).empty? }

simonh · April 4, 2011, 11:13am

On Sun, Apr 3, 2011 at 4:44 PM, Adam P. wrote:

2011/4/3 Jesús Gabriel y Galán

result = []
test.each {|x| result << x unless x.grep(/one/).empty?}
result

More to-the-point:

test.reject { |x| x.grep(/one/).empty? }

It seems we may want to match at the first position only, so I’d
rather do one of these:

irb(main):003:0> test.select {|a,b| /one/ =~ a}
=> [["one", "vol1"], ["one", "vol2"]]
irb(main):004:0> test.select {|a,b| a == "one"}
=> [["one", "vol1"], ["one", "vol2"]]

If we are only interested in the other bit

irb(main):005:0> test.select {|a,b| a == "one"}.map {|a,b| b}
=> ["vol1", "vol2"]

Or, with #inject for a change since we know the key when doing exact
matches

irb(main):006:0> test.inject([]) {|r,(a,b)| a == "one" ? r << b : r}
=> ["vol1", "vol2"]

Kind regards

robert
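
For comparison, here is a sketch that puts the select/map and #inject versions next to an each_with_object variant. The each_with_object form is not from the thread; it is added only because it avoids having to return the accumulator from the block. The variable names are illustrative.

```ruby
test = [['one', 'vol1'], ['one', 'vol2'], ['two', 'vol3']]

# Two steps: filter on the first element, then keep the second.
via_select_map = test.select { |a, _b| a == 'one' }.map { |_a, b| b }

# One pass: destructure each pair and append b when a matches.
via_inject = test.inject([]) { |r, (a, b)| a == 'one' ? r << b : r }

# Same single pass, but the block's return value is ignored.
via_each_with_object = test.each_with_object([]) { |(a, b), r| r << b if a == 'one' }

p via_select_map, via_inject, via_each_with_object
```

All three build the same array of second elements; which one reads best is largely a matter of taste.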

simonh · April 4, 2011, 9:05pm

Thanks for all the tips. I think #select fits best:

file_results = @films.select { |a, b| /#{@film}/i =~ a }

Just one question. Obviously in the above, a and b refer to the first
and second elements in each subarray. Let’s say we have this array:

=> [["one", "two", "three"], ["one"], ["two", "three"], ["one", "three"]]

What would be the best way to search for ‘one’ (or whatever) in this
case?

Cheers

simonh · April 3, 2011, 11:49pm

7stud – wrote in post #990727:

new_arr = test.select do |arr|
  if arr.include?(target)
    true
  end
end

I guess that can be written more succinctly as:

new_arr = test.select do |arr|
  arr.include?(target)
end

The problem with grep() is that it returns an empty array when it finds
no matches, and an empty array is still truthy in Ruby, so you have to
check explicitly whether the result is empty.
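
One way to sidestep the empty-array check, offered as a sketch rather than the thread's own answer, is to ask each subarray whether any element matches. The name pattern is introduced here for illustration.

```ruby
test = [['one', 'vol1'], ['one', 'vol2'], ['two', 'vol3']]
pattern = /one/

# grep on the outer array returns [], and [] is truthy in Ruby,
# so the raw grep result cannot serve as a yes/no answer.
grep_result = test.grep(pattern)
empty_is_truthy = !!grep_result

# any? with a block answers the yes/no question directly for each subarray.
matches = test.select { |sub| sub.any? { |s| s =~ pattern } }

p grep_result, empty_is_truthy, matches
```

This keeps the regex match from the grep-based answers while dropping the .empty? test.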

simonh · April 4, 2011, 9:44pm

On Apr 4, 2011, at 3:06 PM, Simon H. wrote:

What would be the best way to search for ‘one’ (or whatever) in this
case?

Cheers


irb> @films = [["one", "two", "three"], ["one"], ["two", "three"], ["one", "three"]]
=> [["one", "two", "three"], ["one"], ["two", "three"], ["one", "three"]]
irb> @film = 'One'
=> "One"
irb> @films.select{|film,*_| /#{@film}/i =~ film}
=> [["one", "two", "three"], ["one"], ["one", "three"]]

_ is a valid identifier and I tend to use it to mean either “a
throwaway placeholder” (like in this usage) or “a quick, single block
arg” (as in arry.map{|_| _.something} to mean the same as
arry.map(&:something) without needing Symbol#to_proc).
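
Both uses of _ can be sketched as follows. The words array and the upcase call are stand-ins chosen for illustration; they play the role of arry and something.

```ruby
words = ['one', 'two', 'three']

# `_` used as an ordinary single block argument.
with_underscore = words.map { |_| _.upcase }

# The same call written with Symbol#to_proc.
with_to_proc = words.map(&:upcase)

# `_` as a throwaway: the splat soaks up whatever follows the first element.
films = [['one', 'two', 'three'], ['one'], ['two', 'three']]
firsts = films.map { |first, *_| first }

p with_underscore, with_to_proc, firsts
```

The splat form works for subarrays of any length, including those with a single element.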

-Rob

Rob B.
[email protected] http://AgileConsultingLLC.com/
[email protected] http://GaslightSoftware.com/

simonh · April 4, 2011, 10:05pm

Thanks Rob, very useful. Any chance of expanding on this at all:

“a quick, single block
arg” (as in arry.map{|_| _.something} to mean the same as
arry.map(&:something) without needing Symbol#to_proc).

simonh · April 4, 2011, 10:59pm

On Apr 4, 2011, at 4:05 PM, Simon H. wrote:

Thanks Rob, very useful. Any chance of expanding on this at all:

“a quick, single block
arg” (as in arry.map{|_| _.something} to mean the same as
arry.map(&:something) without needing Symbol#to_proc).


Here’s a little example from an error handler in a script (first
occurrence in an open buffer):

rescue => e
  if $stderr.isatty
    $stderr.puts e.message
    $stderr.puts e.backtrace.select {|_| %r{/app/} =~ _}.join("\n\t")
  else
    raise
  end
end

While I could replace the “_” with a name like “line” or
“backtrace_line”, that doesn’t really increase the expressiveness when
the entire block is one statement/expression.
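
Since this thread began with grep, it may be worth noting that a flat array of strings such as a backtrace is exactly where grep applies directly. A sketch with a made-up backtrace (the paths are hypothetical and stand in for e.backtrace):

```ruby
# Hypothetical backtrace lines, standing in for e.backtrace.
backtrace = [
  "/srv/app/models/film.rb:12:in 'find'",
  "/usr/lib/ruby/gems/foo.rb:40:in 'call'",
  "/srv/app/controllers/films.rb:7:in 'show'"
]

# The select form from the error handler above.
via_select = backtrace.select { |_| %r{/app/} =~ _ }

# The equivalent grep call on the same flat array of strings.
via_grep = backtrace.grep(%r{/app/})

puts via_grep.join("\n\t")
```

Both calls keep only the lines whose path contains /app/.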

-Rob


simonh · April 5, 2011, 1:41am

Simon H. wrote in post #990886:

Thanks for all the tips. I think #select fits best:

file_results = @films.select { |a, b| /#{@film}/i =~ a }

Just one question. Obviously in the above, a and b refer to the first
and second elements in each subarray. Let’s say we have this array:

=> [["one", "two", "three"], ["one"], ["two", "three"], ["one", "three"]]

Neither grep() nor include?() depended on the size of the sub-arrays
they are searching, so those answers don’t change:

data = [
  ["one", "two", "three"],
  ["one"],
  ["two", "three"],
  ["one", "three"]
]

target = 'one'

results = data.select do |arr|
  arr.include?(target)
end

p results

--output:--
[["one", "two", "three"], ["one"], ["one", "three"]]

However, if your target can only appear in the first position of each
subarray, then it is simpler, and cheaper, to check just the first
element (this assumes no subarray is empty, since arr[0] would then be
nil):

data = [
  ["one", "two", "three"],
  ["ONE"],
  ["two", "three"],
  ["oNe", "three"]
]

target = 'one'

results = data.select do |arr|
  arr[0].downcase == target
end

p results

--output:--
[["one", "two", "three"], ["ONE"], ["oNe", "three"]]

Personally, I try to avoid regexes whenever possible in order to make
the code clearer, and the result is usually more efficient.
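
A regex-free variant of the first-element check can be sketched like this, assuming Ruby 2.4 or later for String#casecmp?. The empty subarray in the data is added here on purpose, to show the guard against a nil first element.

```ruby
data = [
  ['one', 'two', 'three'],
  ['ONE'],
  [],                      # added for illustration: arr.first is nil here
  ['two', 'three'],
  ['oNe', 'three']
]

target = 'one'

# Guard against a missing first element, then compare case-insensitively.
results = data.select do |arr|
  first = arr.first
  first.is_a?(String) && first.casecmp?(target)
end

p results
```

Unlike arr[0].downcase == target, this does not raise NoMethodError when a subarray is empty.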