Note that because I am traveling tomorrow, I’ve posted this week’s
quiz a bit early.
The three rules of Ruby Q. 2:
1. Please do not post any solutions or spoiler discussion for this
   quiz until 48 hours have passed from the time on this message.
2. Support Ruby Q. 2 by submitting ideas as often as you can! (A
   permanent, new website is in the works for Ruby Q. 2. Until then,
   please visit the temporary website at
3. Suggestion: A [QUIZ] in the subject of emails about the problem
   helps everyone on Ruby T. follow the discussion. Please reply to
   the original quiz message, if you can.
There are numerous themes we have encountered across all of the past
problems, but there are a few that come back time and time again, albeit
sometimes in disguise. I can recall a number of quizzes that were most
easily approached using pattern matching. Data searching is also a
theme, most often accessing the large, well-known databases of
This week we’re going to explore another large database that you might
not be familiar with: the USDA’s Nutrient Database. You can find out
about this database at:
The current database (SR20) can be downloaded from:
I recommend getting the abbreviated, ASCII download (a flat-file
database), though those who want to experience the full brunt of the
relational database are welcome to download that. I will focus on the
abbreviated version, as it will serve our needs for this and future
quizzes.
Opening the archive for the abbreviated database, you’ll find two files:
- ABBREV.txt: this is the ASCII database
- SR20_doc.pdf: a document describing the format and content of the
  database
(Note that SR20 now also contains a patch to the database. For the
sake of this quiz, I am not concerned whether you apply that patch or
not. If you don’t want to worry about the patch, feel free to ignore it.)
The format of the database is fairly simple; the provided document
describes the abbreviated file format beginning on page 29. To summarize,
each record is a single line and contains more than a few delimited
fields. Fields are separated by carets (^), and text fields are
surrounded by tildes (~). The file is sorted by the first field, the
food’s Nutrient Databank Number (NDB). Each line provides nutrient
information for 100 grams of that food.
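To make the record format concrete, here is a minimal sketch of splitting one line into fields. The method name parse_record and the sample line are made up for illustration (the sample is shortened and not copied from ABBREV.txt), and treating an empty field as a missing value is my assumption, so check the document before relying on it.

```ruby
# Split one record into fields: carets separate fields, tildes wrap
# text fields. Numeric fields become Floats; empty fields become nil
# (assumed here to mean "no value recorded").
def parse_record(line)
  line.chomp.split("^", -1).map do |field|
    if field.length >= 2 && field.start_with?("~") && field.end_with?("~")
      field[1..-2]   # text field: strip the surrounding tildes
    elsif field.empty?
      nil            # missing value
    else
      Float(field)   # numeric field; raises ArgumentError on junk
    end
  end
end

# Hypothetical, shortened sample line for illustration only.
sample = "~01001~^~BUTTER,WITH SALT~^15.87^717^^0.85"
p parse_record(sample)
```

Passing -1 as the limit to split keeps trailing empty fields, so every record yields the same number of columns even when the last values are blank.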
Your task is to provide a function that will search this nutrient
database for a food and provide information about it.
    def nutrient_report(food, weight=100)
      # print report to stdout
    end
Parameter food will be a string that is the food to locate. Keep in
mind that there may be multiple entries that will simply match (a la grep)
the parameter provided. You should only report on one of these foods at this
time; which one to choose is up to you. You may want to consider a metric such
as the Levenshtein Distance
while comparing food names against the search string.
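One possible way to pick a single food from several grep-style matches is sketched below: keep every name that contains the search string, then choose the one with the smallest Levenshtein distance to it. The method names and the three food names are hypothetical, and this is only one reasonable selection rule, not the required one.

```ruby
# Classic dynamic-programming Levenshtein distance, keeping one row
# of the table at a time.
def levenshtein(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Case-insensitive substring match, then the closest name wins.
# Returns nil when nothing matches.
def best_match(names, query)
  q = query.upcase
  names.select { |name| name.upcase.include?(q) }
       .min_by { |name| levenshtein(name.upcase, q) }
end

# Hypothetical food names for illustration only.
names = ["BUTTER,WITH SALT", "BUTTER OIL,ANHYDROUS", "PEANUT BUTTER,SMOOTH"]
puts best_match(names, "butter")
```

Because every candidate already contains the search string, the distance here reduces to the difference in length, so the shortest matching name wins. A looser first filter (for example, matching individual words) would make the distance do more of the work.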
Parameter weight is the weight to measure in grams, defaulting to 100.
(Recall that the nutrient information of each record of the database is
based upon 100 grams.) Your report should output numerical information
that corresponds to the weight requested. There is information in the
document provided that explains how to adjust for weight.
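Since every record describes 100 grams, the simplest adjustment is a linear proportion: multiply the stored value by the requested weight and divide by 100. The sketch below assumes that is the adjustment the document describes; scale_nutrient is a hypothetical helper name, and passing nil through for a missing value is my own choice.

```ruby
# Scale a per-100-gram nutrient value to the requested weight in grams.
# A nil value (nothing recorded for this nutrient) stays nil.
def scale_nutrient(value_per_100g, weight_in_grams)
  return nil if value_per_100g.nil?
  value_per_100g * weight_in_grams / 100.0
end

# Example: a food listed at 717 units per 100 g, reported for 50 g.
puts scale_nutrient(717.0, 50)
```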
The output you provide is mostly up to you, but should include as a
minimum:
- Full food name (as found in the database, not the search string)
- Food weight (as provided to the function)
- Nutrient values for:
- Carbohydrates (the
- Fats (sum of the fields
A few more things to consider. First, the database contains information
on over 7,500 food items. That may be a lot to search and do string
comparisons on. If you find your searches going very slowly, consider
caching the database to a more search-efficient format.
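As one sketch of what caching could look like, the parsed records can be serialized with Ruby's built-in Marshal so that later runs skip the text parsing. The method name load_cached, the cache file name, and the tiny hash standing in for parsed records are all hypothetical; a hash keyed by food name is just one possible "search-efficient" shape.

```ruby
require "tmpdir"

# Return the cached records if the cache file exists; otherwise run
# the block to build them, write the cache, and return them.
def load_cached(cache_path)
  return Marshal.load(File.binread(cache_path)) if File.exist?(cache_path)
  records = yield
  File.binwrite(cache_path, Marshal.dump(records))
  records
end

# Demonstration in a temporary directory. The block stands in for the
# real (slow) parse of ABBREV.txt and runs only on the first call.
Dir.mktmpdir do |dir|
  cache = File.join(dir, "abbrev.cache")
  2.times do
    foods = load_cached(cache) { { "BUTTER,WITH SALT" => [15.87, 717.0] } }
    puts foods.keys.first
  end
end
```

This sketch never invalidates the cache, so delete the cache file (or compare modification times) if ABBREV.txt changes, for example after applying the patch.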
Second, consider writing some tests with database integrity in mind. For
example, at a quick glance, it appears that all the food names are
entered in the database in full-caps. But if you base your search on this
assumption, you may miss at least one food (or perhaps more) in your
search, as at least one food was entered into ABBREV.txt in mixed-case.
There may be other irregularities in the file, so consider doing a few
sanity checks on the data file before diving into the heart of the quiz.
(Feel free to post integrity tests to the mailing list before the waiting
period is up.)
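A couple of the sanity checks mentioned above could be sketched like this: flag any record whose field count differs from the first record's, and any food name that is not entirely upper-case. The method name and the sample lines are hypothetical, and the sketch assumes the food name is the second field of each record.

```ruby
# Return [line_number, description] pairs for records that look odd.
def integrity_problems(lines)
  expected = nil
  problems = []
  lines.each_with_index do |line, index|
    fields = line.chomp.split("^", -1)
    expected ||= fields.size   # first record sets the expected count
    if fields.size != expected
      problems << [index + 1, "field count #{fields.size}, expected #{expected}"]
    end
    name = fields[1].to_s.delete("~")   # assumed: name is the second field
    if name != name.upcase
      problems << [index + 1, "name not all upper-case: #{name}"]
    end
  end
  problems
end

# Hypothetical, shortened lines for illustration only.
sample_lines = [
  "~01001~^~BUTTER,WITH SALT~^717",
  "~01002~^~Butter,whipped~^717",
  "~01003~^~BUTTER OIL~"
]
integrity_problems(sample_lines).each { |number, text| puts "line #{number}: #{text}" }
```

Against the real file, something like integrity_problems(File.readlines("ABBREV.txt")) would run the same checks, assuming the file sits in the current directory.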
Third, and finally, part of the goal here is to make available another
large, interesting database for future Ruby Q. problems. There are
plenty of opportunities available here… meal planning is just one example.
Keep this in mind while designing your solution: we want a firm
foundation for searching this nutrient database so that future problems
can focus on examining the results of the search.