I’m toying with GEDCOM again, and trying to write out the YAML
equivalent of a GEDCOM file (just for kicks, mainly).
A GEDCOM file might look like this (edited for brevity):
0 @I1@ INDI
1 NAME Victoria /Hanover/
1 TITL Queen of England
1 FAMS @F1@
1 FAMC @F2@
0 @I2@ INDI
1 NAME Albert Augustus Charles//
1 TITL Prince
1 FAMS @F1@
1 FAMC @F3@
…
0 @F1@ FAM
1 HUSB @I2@
1 WIFE @I1@
1 CHIL @I3@
The stuff between @ symbols can be thought of as pointers (a definition
when it comes before the tag, a reference when it comes after). The
natural way to store this in Ruby is to set up a Hash of these pointer
names while parsing and store the actual object references, i.e.
victoria['FAMS'] is not the string "@F1@" but the actual family Hash.
Now, when trying to dump the resulting data structures (which are made
up entirely of Arrays, Hashes, and Strings), you can see that by the
very nature of the data almost everything in the file is reachable
through one long chain of links from the root person, who is usually at
the beginning. So the first entry in the top-level YAML array would end
up containing, nested inside it, every other individual and family. This
naturally makes for a very deep structure, which is neither
human-friendly nor stack-friendly. In fact, even modest GEDCOM files
overflow the stack during to_yaml with this approach.
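
To make the setup concrete, here is a rough sketch of the kind of two-pass parse described above. The method name parse_gedcom, the 'ID' and 'TYPE' keys, and the sample constant are all made up for illustration, and the sketch only handles levels 0 and 1 with one value per tag (a second CHIL line would overwrite the first), so treat it as an outline rather than a real GEDCOM reader.

```ruby
# Hypothetical sample, trimmed from the records shown earlier.
SAMPLE_GEDCOM = <<~GED
  0 @I1@ INDI
  1 NAME Victoria /Hanover/
  1 FAMS @F1@
  0 @I2@ INDI
  1 NAME Albert Augustus Charles//
  1 FAMS @F1@
  0 @F1@ FAM
  1 HUSB @I2@
  1 WIFE @I1@
GED

def parse_gedcom(text)
  by_id   = {}  # pointer name (e.g. "@F1@") => record Hash
  records = []  # top-level array, in file order
  current = nil

  # Pass 1: one Hash per level-0 record, values kept as raw strings.
  text.each_line do |line|
    level, tag, value = line.strip.split(' ', 3)
    next if level.nil?
    if level == '0'
      # Level-0 lines read "0 @ID@ TYPE", so the pointer sits in the tag slot.
      current = { 'ID' => tag, 'TYPE' => value }
      by_id[tag] = current
      records << current
    elsif current
      current[tag] = value
    end
  end

  # Pass 2: swap each "@...@" string for the Hash it points to.
  records.each do |rec|
    rec.each do |key, val|
      next if key == 'ID'
      rec[key] = by_id[val] if val.is_a?(String) && by_id.key?(val)
    end
  end
  records
end

records  = parse_gedcom(SAMPLE_GEDCOM)
victoria = records[0]
family   = victoria['FAMS']
puts family['HUSB']['NAME']
```

After this, victoria['FAMS'] is the family Hash itself, and family['WIFE'] points straight back at victoria, so starting from the first record you can reach everything else, and the structure is cyclic.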
Is there no way to have YAML break deep nesting with an ID reference? Or
better yet, use an ID at every point other than the shallowest point
where that object occurs?
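
For comparison, one workaround that sidesteps the question rather than answering it is to turn the references back into ID strings just before dumping. The sketch below assumes each record carries its own pointer name under a hypothetical 'ID' key; flatten_refs is an illustrative helper, not anything the yaml library provides.

```ruby
require 'yaml'

# Hypothetical records, linked by object reference as described above.
victoria = { 'ID' => '@I1@', 'NAME' => 'Victoria /Hanover/' }
albert   = { 'ID' => '@I2@', 'NAME' => 'Albert Augustus Charles//' }
family   = { 'ID' => '@F1@', 'HUSB' => albert, 'WIFE' => victoria }
victoria['FAMS'] = family
albert['FAMS']   = family
records = [victoria, albert, family]

# Return shallow copies of the records in which any value that is itself
# a top-level record is replaced by a copy of that record's ID string.
def flatten_refs(records)
  # Compare by identity so the cyclic Hashes are never hashed by content.
  top_level = {}.compare_by_identity
  records.each { |rec| top_level[rec] = true }

  records.map do |rec|
    rec.each_with_object({}) do |(key, val), flat|
      # dup the ID so the flattened structure shares no objects.
      flat[key] = top_level.key?(val) ? val['ID'].dup : val
    end
  end
end

flat = flatten_refs(records)
yaml = YAML.dump(flat)
puts yaml
```

This keeps every record at the top level with nesting one deep, at the cost of having to resolve the ID strings again after loading, which is exactly the bookkeeping I was hoping YAML could do for me.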