Merging two arrays

Hi,

in Ruby, I have two arrays:

products = Array.new
products << ["Amazon",121212,"Harry Potter"]
products << ["Amazon",242424,"John Grisham"]
products << ["Amazon",353535,"Michael Crichton"]

links = Array.new
links << [121212,"Amazon Book Clubs"]
links << [242424,"www.amazon.com/johngrisham"]
links << [353535,"www.amazon.com/somelink"]

I would like to merge these two arrays. The article number is to be the
identifier.

Thus, I would like an array like this as the result:
["Amazon",121212,"Harry Potter","Amazon Book Clubs"]

Any help greatly appreciated.

Cheers, Chris

On Sep 22, 2008, at 9:02 AM, Chris C. wrote:

links << [121212,"Amazon Book Clubs"]
Any help greatly appreciated.

Cheers, Chris

Treat links as an associative array, and use something like:

products.collect! {|p| p << links.assoc(p[1])[1] }

I.e., look at the docs for Array#assoc
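
One caveat: Array#assoc returns nil when no sub-array starts with the given key, so the one-liner above raises NoMethodError for any product that has no link. A minimal sketch of a nil-safe variant follows; the 999999 entry is made up for illustration and is not part of the original data.

```ruby
products = [
  ["Amazon", 121212, "Harry Potter"],
  ["Amazon", 242424, "John Grisham"],
  ["Amazon", 999999, "No Link Yet"]   # hypothetical product without a link
]
links = [
  [121212, "Amazon Book Clubs"],
  [242424, "www.amazon.com/johngrisham"]
]

# Array#assoc returns the first sub-array whose first element equals the
# argument, or nil if there is none.
merged = products.map do |p|
  pair = links.assoc(p[1])
  p + [pair ? pair[1] : nil]   # append the link text, or nil when missing
end
```

This version uses map and +, so it builds new arrays and leaves products untouched, unlike collect! with <<.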

-Rob

Rob B. http://agileconsultingllc.com

On Mon, Sep 22, 2008 at 3:02 PM, Chris C. wrote:

links << [121212,"www.amazon.com/abc"]
links << [242424,"www.amazon.com/johngrisham"]
links << [353535,"www.amazon.com/somelink"]

I would like to merge these two arrays. The article number is to be the
identifier.

Thus, I would like an array like this as the result:
["Amazon",121212,"Harry Potter","Amazon Book Clubs"]

I assume you mean you want an array of three entries like the above one:

irb(main):017:0> products.map {|x| x + links.assoc(x[1])[1..-1]}
=> [["Amazon", 121212, "Harry Potter", "Amazon Book Clubs"],
["Amazon", 242424, "John Grisham", "www.amazon.com/johngrisham"],
["Amazon", 353535, "Michael Crichton", "www.amazon.com/somelink"]]

The method assumes that in the products array, the id is in index 1,
and in the links array it’s in index 0.
Anyway, if you need to do many operations based on the id I suggest
using a hash with the id as the key
and an array (or Struct) as the value.
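
A rough sketch of what that hash could look like, assuming a Struct named Book with fields source, title and link (those names are mine, not from the original post):

```ruby
# Hypothetical Struct; the field names are chosen for illustration only.
Book = Struct.new(:source, :title, :link)

products = [
  ["Amazon", 121212, "Harry Potter"],
  ["Amazon", 242424, "John Grisham"],
  ["Amazon", 353535, "Michael Crichton"]
]
links = [
  [121212, "Amazon Book Clubs"],
  [242424, "www.amazon.com/johngrisham"],
  [353535, "www.amazon.com/somelink"]
]

# Index the products by id; link stays nil until a matching link is seen.
books = {}
products.each { |source, id, title| books[id] = Book.new(source, title) }

# Attach each link to its product, skipping links with an unknown id.
links.each { |id, link| books[id].link = link if books.key?(id) }
```

After this, books[242424].link gives the link directly, with no scan over either array.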

Hope this helps,

Jesus.

Chris C. wrote:

I would like to merge these two arrays. The article number is to be
the identifier.

Thus, I would like an array like this as the result:
["Amazon",121212,"Harry Potter","Amazon Book Clubs"]

products.each do |product|
  id = product[1]
  links.each do |link|
    if link[0] == id
      product << link[1]
    end
  end
end

This will append the corresponding urls from the “links” array to the
individual arrays held in the “products” array.

However, since you are already using an id, you might consider using a
hash of hashes to store your data, for example:

products = {
  242424 => {
    :source => "Amazon",
    :author => "John Grisham",
    :url => nil
  }
}

Later you can just change the :url value of an identified entry, using

products[242424][:url] = "www.amazon.com/johngrisham"

This will avoid having your data scattered about over numerous arrays
and having to remember which field in which array holds the id value.
Using hashes might be more straightforward in this case.
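
If the data already exists as the two arrays, it can be converted in one pass each. This is only a sketch: the names product_rows and link_rows and the :title key are my own choices, not something from the original post.

```ruby
product_rows = [
  ["Amazon", 121212, "Harry Potter"],
  ["Amazon", 242424, "John Grisham"],
  ["Amazon", 353535, "Michael Crichton"]
]
link_rows = [
  [121212, "Amazon Book Clubs"],
  [242424, "www.amazon.com/johngrisham"],
  [353535, "www.amazon.com/somelink"]
]

# Build the hash of hashes, keyed by article number.
products = {}
product_rows.each do |source, id, title|
  products[id] = { :source => source, :title => title, :url => nil }
end

# Fill in the :url of each known article; unknown ids are ignored.
link_rows.each do |id, url|
  products[id][:url] = url if products.key?(id)
end
```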

Henning

Chris C. wrote:

I would like to merge these two arrays. The article number is to be the
identifier.

Thus, I would like an array like this as the result:
["Amazon",121212,"Harry Potter","Amazon Book Clubs"]

Are these two arrays already aligned and of the same size? If so just
iterate using each_with_index, or look at Array#zip
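
For the aligned case, a small sketch with Array#zip might look like this. The aligned check is an extra safety step I added; it is only an assumption that you would want it.

```ruby
products = [
  ["Amazon", 121212, "Harry Potter"],
  ["Amazon", 242424, "John Grisham"],
  ["Amazon", 353535, "Michael Crichton"]
]
links = [
  [121212, "Amazon Book Clubs"],
  [242424, "www.amazon.com/johngrisham"],
  [353535, "www.amazon.com/somelink"]
]

# zip pairs element i of products with element i of links (nil if links is shorter).
pairs = products.zip(links)

# Verify that every pair really shares the same article number.
aligned = pairs.all? { |product, link| link && product[1] == link[0] }

# Only merge by position when the arrays are known to line up.
merged = aligned ? pairs.map { |product, link| product + [link[1]] } : nil
```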

If they are not aligned, or there are items in one list which are not in
the other, then you need to be clearer about your requirements.

  • What do you want to happen if an item is in ‘products’ but not in
    ‘links’, or vice versa?

  • What do you want to happen if there are multiple items in the ‘links’
    list which match an item in the ‘products’ list, or vice versa?

One possibility:

  • iterate across products.
  • a product with no link gives a nil in the link column.
  • a link with no product is ignored.
  • if there are multiple links matching a product, only the first is
    used.

In that case you could just write:

products.each do |product|
  link = links.find { |l| l[0] == product[1] }
  product[3] = link ? link[1] : nil
end

If your links list is huge, it may be worth building a hash of
article_number=>url first, rather than doing a linear search of the
links array each time round.
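
A sketch of that hash-based lookup, keeping the same rules as above (first matching link wins, missing link gives nil). The duplicate link and the 999999 product are invented here just to exercise those rules.

```ruby
products = [
  ["Amazon", 121212, "Harry Potter"],
  ["Amazon", 242424, "John Grisham"],
  ["Amazon", 999999, "No Link Yet"]          # hypothetical: no matching link
]
links = [
  [121212, "Amazon Book Clubs"],
  [121212, "www.amazon.com/duplicate"],      # hypothetical: second link, ignored
  [242424, "www.amazon.com/johngrisham"]
]

# Build article_number => url once; ||= keeps the first url seen for a number.
url_for = {}
links.each { |number, url| url_for[number] ||= url }

# Each product now needs a single hash lookup instead of a linear search.
products.each { |product| product[3] = url_for[product[1]] }
```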

But you may wish to consider a couple of other things first:

  • make a class for Product and Link. Then you can say product.ref
    instead of product[1], link.url instead of link[1] etc. The code will be
    much easier to read.

  • consider whether your article numbers are guaranteed unique across all
    suppliers. That is, is there any chance that Amazon have a book 121212
    and Waterstones have a different book that they also call 121212?

If so, maybe you want to change your links structure to

links << ["amazon",121212,"Amazon Book Clubs"]

In database-speak, ("amazon",121212) is a composite key.
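
In Ruby an array can serve as a hash key, so a composite key needs nothing special. A minimal sketch, assuming both lists spell the supplier name identically (the Waterstones row is invented to show the clash):

```ruby
products = [
  ["amazon", 121212, "Harry Potter"],
  ["waterstones", 121212, "A Different Book"]   # hypothetical clashing number
]
links = [
  ["amazon", 121212, "Amazon Book Clubs"]
]

# Key the lookup on the [supplier, article_number] pair.
url_for = {}
links.each { |supplier, number, url| url_for[[supplier, number]] = url }

# Same article number, different supplier: only the amazon row gets the link.
merged = products.map { |p| p + [url_for[[p[0], p[1]]]] }
```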

If the article reference is an ISBN then that may be irrelevant. But
ISBNs need to be strings I believe, due to use of the “X” character.
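
Going back to the suggestion of Product and Link classes, Struct is probably the quickest way to try it. In this sketch ref and url follow the names used above; supplier and title are names I made up.

```ruby
# Lightweight classes with named fields instead of positional indexes.
Product = Struct.new(:supplier, :ref, :title, :url)
Link    = Struct.new(:ref, :url)

products = [
  Product.new("Amazon", 121212, "Harry Potter"),
  Product.new("Amazon", 242424, "John Grisham"),
  Product.new("Amazon", 353535, "Michael Crichton")
]
links = [
  Link.new(121212, "Amazon Book Clubs"),
  Link.new(242424, "www.amazon.com/johngrisham")
]

# Same rules as before: first matching link is used, no link leaves url as nil.
products.each do |product|
  link = links.find { |l| l.ref == product.ref }
  product.url = link && link.url
end
```

If the references turn out to be ISBNs, ref would simply hold a string instead of an integer; the matching code stays the same.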