Using YAML?

bodikp · August 10, 2006, 6:06pm

I have a spreadsheet with 3 columns in it, and 174 rows. I need to
cross-reference the items in one column with their corresponding entry
in the other column. Here is a sample with the first 3 entries in this
table.

File Prefix  Service Code  Publication Title
aacmc7p      AACM00        Affirmative Action Compliance Manual
acm059p      BACM00        BNA/ACCA Compliance Manual
acm061p      BACM00        BNA/ACCA Compliance Manual, Interim Index Update

I need to look at filenames coming into a directory, work out their
“file prefix” from column 1 above, and then store the matching value
from column 2 in a separate variable. I need to determine the company
accounting code for files being sent to our printer.

Should I do this with just an array, or could I use YAML?

Thanks a lot,
Peter

bodikp · August 10, 2006, 7:16pm

Well, YAML is a serialisation format so it is really not
something you would use as runtime data storage. An Array
or a Hash would certainly work.

If I misunderstood you and you were looking for a file format,
then YAML would work great. When you YAML.load the file, you
will get your Array/Hash and you can also YAML.dump data into
the file. The file format would look like this:

- - aacmc7p
  - AACM00
  - Affirmative # …

…

Or this:

- prefix: aacmc7p
  code: AACM00
  title: Affirmative # …
- prefix: # …

…


bodikp · August 10, 2006, 7:20pm

Eero S. wrote:

(…and forgot to add some information)

Well, YAML is a serialisation format so it is really not
something you would use as runtime data storage. An Array
or a Hash would certainly work.

If I misunderstood you and you were looking for a file format,
then YAML would work great. When you YAML.load the file, you
will get your Array/Hash and you can also YAML.dump data into
the file. The file format would look like this:

- - aacmc7p
  - AACM00
  - Affirmative # …

…

This would give you an Array of Arrays:

[["aacmc7p", "AACM00", "Affirmative"], […], …]

Or this:

- prefix: aacmc7p
  code: AACM00
  title: Affirmative # …

- prefix: # …

…

This would give an Array of Hashes:

[{"prefix" => "aacmc7p",
  "code" => "AACM00",
  "title" => "Affirmative…"},
 {…},
 …]
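
For what it is worth, here is a minimal sketch of parsing both layouts, assuming Ruby's standard yaml library. It uses only the first two sample rows from the spreadsheet excerpt, and the variable names are made up for illustration:

```ruby
require "yaml"

# Layout 1: a sequence of sequences, one inner list per spreadsheet row.
rows_yaml = <<~YAML
  - - aacmc7p
    - AACM00
    - Affirmative Action Compliance Manual
  - - acm059p
    - BACM00
    - BNA/ACCA Compliance Manual
YAML

# Layout 2: a sequence of mappings, one mapping per spreadsheet row.
records_yaml = <<~YAML
  - prefix: aacmc7p
    code: AACM00
    title: Affirmative Action Compliance Manual
  - prefix: acm059p
    code: BACM00
    title: BNA/ACCA Compliance Manual
YAML

rows = YAML.load(rows_yaml)        # Array of Arrays
records = YAML.load(records_yaml)  # Array of Hashes with String keys

puts rows.first.inspect
puts records.first["code"]
```

Note that the Hash keys come back as Strings, so the lookup is records.first["code"], not records.first[:code].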


bodikp · August 10, 2006, 7:50pm

Thanks, Eero.
No, I wasn’t looking for a file format. I just need to associate the
items in the first column with their corresponding entries in the
second column. I guess a hash makes more sense. I’d just read a bit
about YAML, and I was intrigued by the fact that it’s simply text-based
and that I could parse it in some way. At least, that’s how I
understand it.

I’ve got filenames coming into a directory. I need to look at the first
7 characters of those filenames and determine what the “charge” code is
for the publication that’s publishing that file. This is all for our
company accounting system.

Thanks again.

bodikp · August 11, 2006, 1:57pm

Yes, I’ve been reading up more on hashes in my “pickax” book. I was
intrigued with YAML because I thought that it would enable me to
populate a simple table with simple text that could easily be updated
when needed. And, because I’m talking about 174 entries here, it just
seemed more graceful than a hash or an array inside a script. I thought
that I could access a separate YAML table file within my script that
would be accessed in the same way I’d access a hash, through an
association of one entry with another entry in the same “row.”

But, your suggestions are perfect. I’m going to do the hash. Thank you
very much for your help, Eero.

bodikp · August 10, 2006, 8:20pm

Right, OK. Then you want a Hash of file_prefix => service_code
mappings, let us say:

maps = {"aacmc7p" => "AACM00", "acm059p" => "BACM00"}  # … and so on for the remaining rows
first_seven = File.basename(filename)[0, 7]  # first seven characters of the file name
puts maps[first_seven]

That will, for a file starting with ‘aacmc7p’, produce ‘AACM00’.
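
Spelled out a little more, the lookup might look like the sketch below. It contains only the three rows shown in the thread. The constant name, the method name, and the incoming file names are all hypothetical, and returning nil for an unknown prefix is just one possible choice:

```ruby
# Only the rows shown in the thread; the real table has 174 entries.
SERVICE_CODES = {
  "aacmc7p" => "AACM00",
  "acm059p" => "BACM00",
  "acm061p" => "BACM00",
}.freeze

# Returns the service code for a file name, or nil if the prefix is unknown.
def service_code_for(filename)
  prefix = File.basename(filename)[0, 7]  # first seven characters, directory stripped
  SERVICE_CODES[prefix]
end

# Hypothetical incoming file names, for illustration only.
["aacmc7p_0810.pdf", "incoming/acm061p.ps", "unknown.txt"].each do |name|
  code = service_code_for(name)
  puts "#{name}: #{code || 'no matching prefix'}"
end
```

If a missing prefix should be treated as an error instead, SERVICE_CODES.fetch(prefix) raises a KeyError rather than returning nil.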

Regarding YAML: it is a serialisation format. If it helps, you can
think of it as, in some respects, a replacement for XML, used for
similar purposes. In your case you would store the mappings in YAML
(instead of in the spreadsheet) and then load them into a runtime
data structure such as an Array or a Hash. The data format would look
like what I posted above, and parsing it with YAML.load would produce
the Array/Hash data structures shown in my second post; YAML.dump
goes the other way, from the data structure back to text.

See http://yaml4r.sourceforge.net
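
If the table is kept in a separate text file, one possible arrangement (a sketch, not the only way) is a plain YAML mapping of prefix to code, which loads directly as a Hash. The temporary file below only stands in for a real table file, and the method name and the "service_codes" file name are assumptions for illustration:

```ruby
require "yaml"
require "tempfile"

# Loads a prefix => service code table from a YAML file.
def load_service_codes(path)
  YAML.load_file(path)
end

# Write the three sample rows to a temporary file, then load them back.
table = Tempfile.create(["service_codes", ".yml"]) do |file|
  file.write(<<~YAML)
    aacmc7p: AACM00
    acm059p: BACM00
    acm061p: BACM00
  YAML
  file.flush
  load_service_codes(file.path)
end

puts table["acm059p"]
puts YAML.dump(table)  # the same table serialised back to YAML text
```

The file can then be edited by hand when rows change, and the script keeps using ordinary Hash lookups.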
