INNER JOINING Two Arrays

dandiebolt · August 24, 2009, 6:23pm

I have two arrays a and b which I need to iterate through simultaneously
creating essentially an “INNER JOIN” on the values of the two arrays.
Both arrays contain integers and are sorted in ascending order. The
first value in each array is the same and the last value in each array
is the same. On each iteration through the arrays I need to obtain the
indexes of each array if the array values are equal to each other.

This is the code I have which works:

a=[1,2,5,13]
b=[1,1,2,2,2,5,13,13,13]

indexs=[]
items=[]
a.each_with_index do |itema,indexa|
Â b.each_with_index do |itemb,indexb|
Â if (itema==itemb) then
Â Â puts “#{indexa} #{indexb}”
Â Â indexs << [indexa,indexb]
Â Â itemsÂ << [itema,itemb]
Â else
Â Â next
Â end
Â end
end

producing this output:

0 0
0 1
1 2
1 3
1 4
2 5
3 6
3 7
3 8

and these values of indexs and items:

irb(main):017:0> indexs
=> [[0, 0], [0, 1], [1, 2], [1, 3], [1, 4], [2, 5], [3, 6], [3, 7], [3,
8]]

irb(main):018:0> items
=> [[1, 1], [1, 1], [2, 2], [2, 2], [2, 2], [5, 5], [13, 13], [13, 13],
[13, 13]]

Q1: Is there a more idiomatic and concise way of doing this using
functional array methods?Â

Also note that if you interchanged the two each loops they would produce
the same output and results:

b.each_with_index do |itemb,indexb|
Â a.each_with_index do |itema,indexa|
Â …
Â end
end

Q2: Are there any observations one could make about which scenario would
be faster both for my code and a more idiomatic solution you might have?

dandiebolt · August 24, 2009, 7:15pm

On 24.08.2009 18:22, Dan wrote:

Q2: Are there any observations one could make about which scenario
would be faster both for my code and a more idiomatic solution you
might have?

Your algorithm has effort O(nm) while with the given preconditions you
should be able to achieve O(n+m) instead. Less formal speaking, your
version compares each possible pair (all combinations, hence nm) while
more efficient approaches exist.

Hint (because this seems to be some kind of assignment): you cannot use
Ruby’s iteration via #each but should use array indexes.

Kind regards

robert

dandiebolt · August 24, 2009, 8:12pm

I’d do something like this:

b=[1,1,2,2,2,5,13,13,13]
data = Hash.new{|hash, key| hash[key] = []}

b.each_with_index do |num, i|
data[num] << i
end

a=[1,2,5,13]

indexes = []
items = []

a.each_with_index do |num, i|
indexes_for_num = data[num]
arrs = indexes_for_num.map{|index| [i, index]}
indexes.push(*arrs)

indexes_for_num.length.times{items << [num, num]}
end

p indexes
p items

–output:–
[[0, 0], [0, 1], [1, 2], [1, 3], [1, 4], [2, 5], [3, 6], [3, 7], [3, 8]]
[[1, 1], [1, 1], [2, 2], [2, 2], [2, 2], [5, 5], [13, 13], [13, 13],
[13, 13]]

For small arrays, like the ones in your example, the speeds will be
close, but as the arrays get bigger, the hash lookups will be much
faster, like 10 times faster.

dandiebolt · August 25, 2009, 2:21pm

hello
you could also use index and rindex
http://www.ruby-doc.org/core/classes/Array.html#M002178
http://www.ruby-doc.org/core/classes/Array.html#M002179