Building a data structure - Hashes of Hashes of Hashes of Arrays

Good morning everyone,

This week at work I had the challenge of parsing a specific file format
that contains IP ranges categorized by different Sites, Areas and
Regions.
Basically I needed a script to load all this location information into a
data structure that would allow me an easy way of obtaining all the IPs
of sites, areas or regions for a later transformation.

Required data structure:
data[Region][Area][Site] -> IPs
Hash Hash Hash Array
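
To make the target shape concrete, here is a minimal sketch of it written by hand as a literal, using a few values from the sample data further down, together with one possible way of collecting all the IPs of a site, an area or a region. The variable names `site_ips`, `area_ips` and `region_ips` are only illustrative.

```ruby
# Hand-written sample of the target shape: Region -> Area -> Site -> [IP ranges]
data = {
  'Asia' => {
    'Australia' => {
      'Alexandria (alh)' => ['192.168.6.0-192.168.6.127'],
      'Altona'           => ['192.168.1.192-192.168.1.255',
                             '192.168.2.192-192.168.2.255']
    },
    'Japan' => {
      'Tokyo vpn' => ['192.168.3.192-192.168.3.255']
    }
  }
}

# All IP ranges of one site
site_ips = data['Asia']['Australia']['Altona']

# All IP ranges of one area (every site under it)
area_ips = data['Asia']['Australia'].values.flatten

# All IP ranges of one region (every site of every area under it)
region_ips = data['Asia'].values.flat_map { |sites| sites.values.flatten }

puts site_ips.size, area_ips.size, region_ips.size
```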

I would like to know whether the function "processLocations" could be
optimized, or whether there is a simpler way of building the desired data
structure, especially the creation of the "Hashes of Hashes of Hashes
of Arrays" region variable.

I hope this can also help someone else in the same situation, so here is
my current working copy:

require 'pp'

# Function that processes the content of the locations file and returns
# the following structure:
#
#   data[Region][Area][Site] -> IPs
#   Hash  Hash  Hash     Array
def processLocations(lines)
sites = Hash.new { |h, k| h[k] = [] } # hash of arrays
area = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) } # hash of hashes
region = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) } # hash of hashes

lines.each do |line|
next if line =~ /^#/ # skip comment lines (test the current line, not the whole array)

# Process IPs range section
if line =~ /(.*)=([\d.\-]+)/ # inside [...] a "|" is a literal character, not an alternation
  #puts "IP: #{$1} - #{$2}"
  sites[$1.chomp.capitalize] << $2
end

# Process area section
if line =~ /(.*)\.area=(.*)/i
  site_name = $1.chomp.capitalize
  area_name = $2.chomp.capitalize
  #puts "Area: #{site_name} - #{area_name}"
  if sites.has_key?(site_name)
    # && short-circuits, so the nested lookup does not auto-create the area key
    if area.has_key?(area_name) && area[area_name].has_key?(site_name)
      # The hash exists: add more IP elements to the existing array
      # (concat keeps the array flat, << would nest an array inside it)
      area[area_name][site_name].concat(sites[site_name])
    else
      # The hash does not exist: add a new hash key with a new array
      area[area_name][site_name] = sites[site_name]
    end

    # Clean site hash
    sites = Hash.new { |h, k| h[k] = [] } # hash of arrays
  end
end

# Process region section
if line =~ /(.*)\.region=(.*)/i
  area_name = $1.chomp.capitalize
  region_name = $2.chomp.capitalize
  #puts "Region: #{area_name} - #{region_name}"
  if area.has_key?(area_name)
    # The region shares the same hash object as the area, so sites added
    # to this area on later lines still show up under the region.
    region[region_name][area_name] = area[area_name]
  end
end
end
return region
end

##############
# MAIN

# DATA is already an open File positioned just after __END__
lines = DATA.readlines
data = processLocations(lines)

puts "+data---------------------------------------------------------"
pp data

puts "+data['Asia']-------------------------------------------------"
pp data['Asia']

puts "+data['Asia']['Australia']------------------------------------"
pp data['Asia']['Australia']

puts "+data['Europe-middle east-africa']['France']['Paris']---------"
pp data['Europe-middle east-africa']['France']['Paris']

__END__
Alexandria (ALH)=192.168.6.0-192.168.6.127
Alexandria (ALH).area=Australia
Australia.region=Asia

Altona=192.168.1.192-192.168.1.255
Altona=192.168.2.192-192.168.2.255
Altona.area=Australia

TOKYO VPN=192.168.3.192-192.168.3.255
TOKYO VPN.area=JAPAN
JAPAN.region=Asia

Paris=192.168.4.192-192.168.4.255
Paris.area=France

Rennes=192.168.5.192-192.168.5.255
Rennes.area=France
France.region=EUROPE-MIDDLE EAST-AFRICA

Example output:

ruby ruby_help.rb

+data---------------------------------------------------------
{"Asia"=>
{"Australia"=>
{"Alexandria (alh)"=>["192.168.6.0-192.168.6.127"],
"Altona"=>["192.168.1.192-192.168.1.255",
"192.168.2.192-192.168.2.255"]},
"Japan"=>{"Tokyo vpn"=>["192.168.3.192-192.168.3.255"]}},
"Europe-middle east-africa"=>
{"France"=>
{"Paris"=>["192.168.4.192-192.168.4.255"],
"Rennes"=>["192.168.5.192-192.168.5.255"]}}}
+data['Asia']-------------------------------------------------
{"Australia"=>
{"Alexandria (alh)"=>["192.168.6.0-192.168.6.127"],
"Altona"=>["192.168.1.192-192.168.1.255",
"192.168.2.192-192.168.2.255"]},
"Japan"=>{"Tokyo vpn"=>["192.168.3.192-192.168.3.255"]}}
+data['Asia']['Australia']------------------------------------
{"Alexandria (alh)"=>["192.168.6.0-192.168.6.127"],
"Altona"=>["192.168.1.192-192.168.1.255",
"192.168.2.192-192.168.2.255"]}
+data['Europe-middle east-africa']['France']['Paris']---------
["192.168.4.192-192.168.4.255"]

Regards and thanks in advance for any suggestions,
Sebastian YEPES

Hi there.

It is hard to read all of your text. Consider posting just enough to show the problem, not everything.

You have to declare your variable in this way:

a = Hash.new{|h, k| h[k] = Hash.new(&h.default_proc)}

So, for example:

a = Hash.new{|h, k| h[k] = Hash.new(&h.default_proc)}
=> {}

b = [1,2,3,4,5]
=> [1, 2, 3, 4, 5]

a["2011"]["november"]["tuesday"] = b
=> [1, 2, 3, 4, 5]

puts a["2011"]["november"]["tuesday"]
1
2
3
4
5
=> nil

puts a
{"2011"=>{"november"=>{"tuesday"=>[1, 2, 3, 4, 5]}}}
=> nil
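
One thing to keep in mind with this kind of declaration, shown here as a small sketch: because the default block assigns `h[k]`, merely reading a missing key with `[]` creates it. `Hash#key?` and `Hash#fetch` do not run the default block, so they can be used to look without creating anything. The variables `found` and `value` are only illustrative.

```ruby
a = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }
a["2011"]["november"]["tuesday"] = [1, 2, 3]

# Reading a missing path with [] silently creates the intermediate hashes
a["2012"]["january"]
p a.keys

# key? and fetch do not trigger the default block, so nothing is created
found = a.key?("2013")
value = a.fetch("2013", nil)
p found, value
p a.keys
```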

On Nov 8, 2011, at 13:48, Cassna Capriet wrote:

You have to declare your variable in this way:

a = Hash.new{|h, k| h[k] = Hash.new(&h.default_proc)}

I prefer a more readable approach:

class NestedHash < Hash
  def initialize
    super { |h, k| h[k] = NestedHash.new }
  end
end

a = NestedHash.new
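
Coming back to the original question, one possibly simpler shape for the parsing is to collect three flat lookups first and assemble the nested structure at the end. This is only a sketch under stated assumptions: the names `build_locations`, `ips`, `area_of` and `region_of` are made up here, and it assumes site and area names are unique across the file, which the original script does not require.

```ruby
# Sketch only: assumes every site name and area name is unique in the file.
def build_locations(lines)
  ips       = Hash.new { |h, k| h[k] = [] } # site => [IP ranges]
  area_of   = {}                            # site => area
  region_of = {}                            # area => region

  lines.each do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#')
    key, value = line.split('=', 2)
    next if value.nil?

    case key
    when /\A(.+)\.area\z/i   then area_of[$1.capitalize] = value.capitalize
    when /\A(.+)\.region\z/i then region_of[$1.capitalize] = value.capitalize
    else ips[key.capitalize] << value
    end
  end

  # Fixed depth: Region -> Area -> Site, so no unbounded auto-nesting is needed
  data = Hash.new { |h, r| h[r] = Hash.new { |h2, a| h2[a] = {} } }
  ips.each do |site, ranges|
    area   = area_of[site]
    region = region_of[area]
    next unless area && region # skip sites without a complete location
    data[region][area][site] = ranges
  end
  data
end

sample = [
  "Altona=192.168.1.192-192.168.1.255",
  "Altona=192.168.2.192-192.168.2.255",
  "Altona.area=Australia",
  "Australia.region=Asia",
  "Paris=192.168.4.192-192.168.4.255",
  "Paris.area=France",
  "France.region=EUROPE-MIDDLE EAST-AFRICA"
]
result = build_locations(sample)
p result['Asia']['Australia']['Altona']
```

Because the assembly happens after the whole file has been read, the order of the `.area` and `.region` lines no longer matters.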