Hash or array?

Hello,

I have a question. I want to store some data in two columns, for example:

0001, stuff1 stuff2
0002, morestuff
0003, extrastuff
0004, more evenmore

As you can see column 1 is an ID and column 2 is one or more strings.

I want to do a search across another data set to say if column 1 (ID
number) matches in both sets and the contents of column 2 to data set B.

I initially thought to use a hash, but it scrambles the order of the
data so was unsure of its efficiency and use.

Any suggestions?

Many thanks

On Mon, Sep 21, 2009 at 10:05 AM, Ne Scripter <
[email protected]> wrote:


I’m not sure I completely understand what you are trying to do but with
hashes you could do the following:

search_value = "0003"  # keep IDs as strings; a bare 0003 is an octal literal in Ruby
if hash[search_value] == second_hash[search_value]
  # the entries match; handle them here
end

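If the key doesn’t exist, the value will be nil (assuming the hash has no
default value), so you might want a special case for that. Note that a key
missing from both hashes would compare as equal, since nil == nil.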
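
One way to handle the missing-key case is sketched below. The helper name same_entry? and the sample data are made up for illustration, and it assumes the IDs are stored as strings:

```ruby
# Hypothetical helper: true only when the ID exists in both hashes
# and the stored contents are equal.
def same_entry?(first, second, id)
  # Hash#key? tells "missing" apart from "present but nil".
  return false unless first.key?(id) && second.key?(id)
  first[id] == second[id]
end

set_a = { "0001" => "stuff1 stuff2", "0003" => "extrastuff" }
set_b = { "0003" => "extrastuff", "0004" => "more evenmore" }

puts same_entry?(set_a, set_b, "0003")  # ID in both, same contents
puts same_entry?(set_a, set_b, "0001")  # ID missing from set_b
```

Checking key? first means an ID absent from both sets is not reported as a match.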



On Mon, Sep 21, 2009 at 6:05 PM, Ne Scripter
[email protected] wrote:

I want to do a search across another data set to say if column 1 (ID
number) matches in both sets and the contents of column 2 to data set B.

You can do something like this to obtain the keys that are common to both sets,
then iterate them doing something with the values (in my example add
the values from the second to the first, not sure if that’s what you
want):

irb(main):001:0> h1 = Hash[1,"a",2,"b",3,"c"]
=> {1=>"a", 2=>"b", 3=>"c"}
irb(main):002:0> h2 = Hash[1,"x",3,"y"]
=> {1=>"x", 3=>"y"}
irb(main):004:0> (h1.keys & h2.keys).each {|key| h1[key] += h2[key]}
=> [1, 3]
irb(main):005:0> h1
=> {1=>"ax", 2=>"b", 3=>"cy"}

I initially thought to use a hash, but it scrambles the order of the
data so was unsure of its efficiency and use.

If you want to keep the order then you will need to sort the keys, but
you can do that after the operation. About efficiency: the hash is
good for key access (amortized O(1)), although in this case it only
helps in the last step (adding the values), as you need to make the
intersection of all the keys first.
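
As a rough sketch of the "sort afterwards" idea, the helper below (common_entries is a made-up name, and string IDs are an assumption) intersects the keys, sorts them, and then looks up both values:

```ruby
# Hypothetical helper: returns [id, value_a, value_b] triples for every ID
# present in both hashes, ordered by ID.
def common_entries(a, b)
  # Array#& gives the IDs found in both key lists; sort restores ID order.
  (a.keys & b.keys).sort.map { |id| [id, a[id], b[id]] }
end

a = { "0003" => "extrastuff", "0001" => "stuff1 stuff2" }
b = { "0001" => "stuff1", "0002" => "morestuff", "0003" => "extrastuff" }

common_entries(a, b).each do |id, left, right|
  puts "#{id}: #{left.inspect} vs #{right.inspect}"
end
```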

Hope this helps,

Jesus.

Hi,

On Tuesday, 22 Sep 2009, 01:05:12 +0900, Ne Scripter wrote:

I have a question, I want to store some data in two columns, for example

0001, stuff1 stuff2
0002, morestuff
0003, extrastuff
0004, more evenmore

I initially thought to use a hash, but it scrambles the order of the
data so was unsure of its efficiency and use.

Be aware that an Array always fills every slot from [0] to [max]:

a = []
a[5] = :five
a #=> [nil, nil, nil, nil, nil, :five]

To keep the order in Hashes, sort the keys:

h = { 1 => :a, 2 => :b }
h.keys.sort.each { |k| h[k] … }
# or even
h.sort.each { |k,v| … }

In Ruby 1.9 a Hash preserves insertion order, so just insert the keys
in ascending order.
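
A small sketch of that, assuming Ruby 1.9 or later; build_ordered is a made-up helper name and the pairs are sample data:

```ruby
# Hypothetical helper: builds a hash whose iteration order is ascending by ID.
# Relies on Hash preserving insertion order (Ruby 1.9 and later).
def build_ordered(pairs)
  h = {}
  pairs.sort_by { |id, _text| id }.each { |id, text| h[id] = text }
  h
end

h = build_ordered([["0003", "extrastuff"], ["0001", "stuff1 stuff2"], ["0002", "morestuff"]])
h.each { |id, text| puts "#{id}, #{text}" }
```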

Bertram

I want to do a search across another data set to say if column 1 (ID
number) matches in both sets and the contents of column 2 to data set B.

I initially thought to use a hash, but it scrambles the order of the
data so was unsure of its efficiency and use.

Any suggestions?

Since everybody else is rooting for hashes, let’s talk arrays. You could
use a structure like this:

mydata = [nil, ["stuff1", "stuff2"], ["morestuff"], ["extrastuff"],
          ["more", "evenmore"]]

Then to search for matches, you can do:

searched1 = mydata[1] & datasetA & datasetB
searched2 = mydata[2] & datasetB

This is assuming that your datasets are arrays of course.

And it’s fine if you want to use hashes to store the key as id too:

mydata = { 1 => ["stuff1", "stuff2"], 2 => ["morestuff"], 3 => ["extrastuff"],
           4 => ["more", "evenmore"] }

searched1 = mydata[1] & datasetA & datasetB

searched2 = mydata[2] & datasetB
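
To make the intersection concrete, here is a minimal sketch. The helper matches_for and the contents of dataset_b are invented for illustration, and it assumes the other data set is a flat array of strings:

```ruby
# Hypothetical helper: returns the strings stored under `id` that also
# appear in `dataset`. Unknown IDs yield an empty array.
def matches_for(data, id, dataset)
  # Array#& keeps elements present in both arrays, without duplicates.
  data.fetch(id, []) & dataset
end

mydata = { 1 => ["stuff1", "stuff2"], 2 => ["morestuff"] }
dataset_b = ["stuff2", "morestuff", "unrelated"]

p matches_for(mydata, 1, dataset_b)
p matches_for(mydata, 9, dataset_b)
```

An empty result means either the ID is unknown or none of its strings occur in the other data set.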

On 21.09.2009 18:05, Ne Scripter wrote:

I want to do a search across another data set to say if column 1 (ID
number) matches in both sets and the contents of column 2 to data set B.

I initially thought to use a hash, but it scrambles the order of the
data so was unsure of its efficiency and use.

Any suggestions?

I think you have got some useful recommendations already. I still have some
doubts whether I understand your requirements properly. If you want to
determine whether set B is a subset of A you can also do something like
this:

require 'set'

YourData = Struct.new :id, :text

# Split each line at the first comma into id and text.
set_a = File.readlines("a.dat").
  map {|l| l.chomp!; YourData.new(*l.split(/,/, 2))}.
  to_set

set_b = File.readlines("b.dat").
  map {|l| l.chomp!; YourData.new(*l.split(/,/, 2))}.
  to_set

if set_a.superset? set_b
  puts "all b are there"
end
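
The same idea can be tried without files. In this sketch Record and to_record_set are made-up names, and the separator is assumed to be a comma followed by optional whitespace. Two Struct instances count as the same set member only when both the id and the text are equal:

```ruby
require 'set'

# Hypothetical record type with one field per column.
Record = Struct.new(:id, :text)

# Hypothetical helper: turns "id, text" lines into a Set of records.
def to_record_set(lines)
  lines.map { |l| Record.new(*l.chomp.split(/,\s*/, 2)) }.to_set
end

set_a = to_record_set(["0001, stuff1 stuff2\n", "0002, morestuff\n"])
set_b = to_record_set(["0002, morestuff\n"])

puts "all b are there" if set_a.superset?(set_b)
```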

Kind regards

robert

Look up ‘facet/dictionary’ – for a hash that can be sorted / keeps the
order.