How to deal with gender

Peter H. wrote:

To be honest I would go with

user_id :integer
title :string
forename :string
midname :string
surname :string
suffixes :string
gender :string

and treat midname as a space separated list of names (although this
wont handle names like ‘st clair’ correctly).

Storing multiple values in a single column break First Normal Form (1NF)
and is always a bad idea IMHO. If fact having multiple fields for the
parts of names also technically breaks 1NF because it is a repeating
group. name1, name2, name3, etc. Naming them as above actually obscures
this fact.

For a fully robust solution we need to eliminate the repeating group:

Baby
id :integer
gender :string

Methods:
def full_name
@components = NameComponent.order(:position)
@components.join(’ ')
end

def name_component_at_index(index)
NameComponent.where(:position => index)
end

def append_name_component


end

def insert_name_component_at_index(index)


end

NameComponent
id :integer
baby_id :integer
position :integer
name :string

Although I showed some example convenience methods above, one could use
something like acts_as_list (or of the derivatives of that) for managing
the list of name components.

I like Peter H.'s answer the best, especially the seperating
midname by spaces - doh why didn’t I think of that…really good to
bounce this around about, although I’ll probably not bother with the
suffixes. Thanks chaps.

Yours sincerely,

bb aka Sir William Bobby Seadorf Lawrence Elliott Fred-Smith the 2nd

On Jul 29, 12:25 pm, bingo bob [email protected] wrote:

Person
belongs_to: gender

Gender
has_many: people

Sorry if this is such an obvious question just want to do it right.

This is a great example of a case where normalization is doing it
wrong. Unless you’re somehow expecting additional genders to be added,
this sort of a solution is overkill and pretty much only nets you
additional database overhead. Using an attribute should suffice…

–Matt J.

If it is a “baby name chooser” application then each “name” needs to
know
whether the name can be “male”, “female” or “both.” I would create
male?
and female? as separate fields that can be either true or false. This
simplistic approach to design is superior to creating a both option in
one
field.

Peter H. wrote:

and treat midname as a space separated list of names (although this
wont handle names like ‘st clair’ correctly).

Comma separated, then?


Alan G. - [email protected] - http://twitter.com/bigeasy

On 29 July 2010 17:25, bingo bob [email protected] wrote:

What’s the best way of to deal with gender.

i.e. a Person is either Male or Female (generally and fot my purposes
I’ll stick with that rule).

I don’t know if it’s annoying anyone else, but I’ve been following
this thread with interest, but every time I read “gender” it grates.
The word you’re after is “sex”.
“Male” and “female” are sexes. “Masculine”, “feminine”, “neutral”,
etc, are genders…

I appreciate that it seems to have become a convention to use the word
“gender” as a “polite” reference to the discrimination between innies
and outies… but for the vast, overwhelming majority there are only
two choices; either being male or female, and that’s your sex (whether
it changes over the duration of your life is a different debate :slight_smile:

Now, I’d store that in a text field in the record as either the full
word, or as “m” or “f”, and load the DB and the model up with
constraints and validations, and not worry about having a Sex model,
with associated “sexes” table. Move along to other areas of your
application.

Regarding the multiple names issue; in response to the suggestion of
having multiple name midname fields, I’d ask “why?” What’s the
ultimate purpose of storing those names? Generally it’s to just be
able to
pull them all out of the DB to put on a letterhead when needed, and in
that event, you’re optimising too far to give them each a field.

For, again, the vast majority of cases, the following will suffice:
first_name
middle_names
last_name
(although that assumes the WASP model, which the majority of the world
has been forced to adjust it’s naming models to, arguably wrongly as
that may have been)

If you want to track whether the name “Michael P.” should be
addressed formally as “Mr Pavling” or as “Michael-san”, or whatever,
you need to have some way of tracking for each person whether the
first_name or the last_name is the “family” name, or I’d consider a
“salutation” field which stored that value (even though it duplicates
values, breaking normalisation a little).

I can’t help but think of the commercial (Visa I think) playing during
the world cup where the child had all the names of players on a
winning squad from an earlier world cup…

Michael P. wrote:

On 29 July 2010 17:25, bingo bob [email protected] wrote:

What’s the best way of to deal with gender.

i.e. a Person is either Male or Female (generally and fot my purposes
I’ll stick with that rule).

I don’t know if it’s annoying anyone else, but I’ve been following
this thread with interest, but every time I read “gender” it grates.
The word you’re after is “sex”.
“Male” and “female” are sexes. “Masculine”, “feminine”, “neutral”,
etc, are genders…

I appreciate that it seems to have become a convention to use the word
“gender” as a “polite” reference to the discrimination between innies
and outies…

Not exactly. Sex is biological, gender is psychosocial. There are
people of male sex and feminine gender, and vice versa.

but for the vast, overwhelming majority there are only
two choices; either being male or female, and that’s your sex (whether
it changes over the duration of your life is a different debate :slight_smile:

No. Your sex doesn’t change over the duration of your life if you’re a
human (at least, not without surgery, but that’s a different debate).
Your gender may. I know genderqueer people who don’t consistently
identify with one gender or another, despite whatever genitalia God gave
them.

Now, I’d store that in a text field in the record as either the full
word, or as “m” or “f”, and load the DB and the model up with
constraints and validations, and not worry about having a Sex model,
with associated “sexes” table. Move along to other areas of your
application.

Agreed if we’re talking about sexes. For genders, there are more than
two categories, and the list might change as other gender categories are
recognized (yes, the issue is more complex than you’d think), so I’d
probably use a model or at least an enumeration for that.

Best,

Marnen Laibow-Koser
http://www.marnen.org
[email protected]

interestingly enough - that’s exactly what I’ve done.

gone with…

forename:string
midnames:string
surname:string

midnames seperated by spaces basically, in the vast majority of cases
I’d expect 0, 1 or more rarely 2 midnames; pragmatically it just seemed
like the right thing to do in this case.

bingo bob wrote:

interestingly enough - that’s exactly what I’ve done.

gone with…

forename:string
midnames:string
surname:string

midnames seperated by spaces basically, in the vast majority of cases
I’d expect 0, 1 or more rarely 2 midnames; pragmatically it just seemed
like the right thing to do in this case.

What you’re doing is most likely going to “work” and will be easy to
implement, but your solution is still bad database design. You are
relying on a repeating group, multiple values in a single column.

It’s fine what you’re doing, as long as you realize that it is bad
design and you can live with that.

For more details see:

On Jul 30, 6:44 am, bingo bob [email protected] wrote:

midname1 :string
midname2 :string
midname3 :string
midname4 :string
midname5 :string

Deignating a particular cardinality just looks kinda wrong to me, tho
of course everyone else’s mileage is pretty much guaranteed to vary.

How about just midnames? Then you won’t run into problems with
typical medieval/Renaissance Spanish names (where they tacked on long
lineages), or that guy whose name starts with 128 repetitions of
“Shri”. :slight_smile: Or possibly “midname” and “more_midnames”? Or you could
go the other way and separate the name only into family_name and
given_names. Question is, is it ever important to have a given
specific one of the (given) names?

-Dave


Specialization is for insects. -RAH | Have Pun, Will Babble! -me
Programming Blog: http://codosaur.us | Work: http://davearonson.com
Leadership Blog: http://dare2xl.com | Play: http://davearonson.net

Dave A. wrote:

On Wed, Aug 4, 2010 at 10:18, Robert W. [email protected]
wrote:

For more details see:
First normal form - Wikipedia

I had considered suggesting going with a complete separate table of
names, containing a person_id (or in this use-case, fullname_id),
sequence number, and string, but that seemed like overkill. It would
be “normal”, but IMHO HAGNI. :slight_smile:

I definitely agree that there are cases where breaking normalization for
other gains is desirable. I just don’t completely agree that this is one
of those cases. In this case following out to 3NF is beneficial. It
would provide for a more flexible and more elegant design. It will also
simplify the design for asking more “interesting” questions like, “Give
me a list of all names that have Dale as a middle name.”

Maybe questions like that aren’t a concern now, but breaking normal
forms in this case will make questions like that more difficult (and
complex) to ask later if the need arises. Why “build in” known
limitations when there is no known reasons for do so?

I understand the desire for simple design, but not at the cost of good
design. Not when there’s no clear reason to avoid the better design.

Robert W. wrote:

I definitely agree that there are cases where breaking normalization for
other gains is desirable. I just don’t completely agree that this is one
of those cases. In this case following out to 3NF is beneficial. It
would provide for a more flexible and more elegant design. It will also
simplify the design for asking more “interesting” questions like, “Give
me a list of all names that have Dale as a middle name.”

Maybe questions like that aren’t a concern now, but breaking normal
forms in this case will make questions like that more difficult (and
complex) to ask later if the need arises. Why “build in” known
limitations when there is no known reasons for do so?

I understand the desire for simple design, but not at the cost of good
design. Not when there’s no clear reason to avoid the better design.

I tend to normalise as often as possible, and I do agree with you
Robert. But http://c2.com/cgi/wiki?YouArentGonnaNeedIt also raises a
very good point.

Pale H. wrote:

I tend to normalise as often as possible, and I do agree with you
Robert. But http://c2.com/cgi/wiki?YouArentGonnaNeedIt also raises a
very good point.

Given that the primary reason for flattening data, that could otherwise
be normalized, is for performance optimization reasons,
http://c2.com/cgi/wiki?PrematureOptimization is the balance to YAGNI.

On Wed, Aug 4, 2010 at 10:18, Robert W. [email protected]
wrote:

For more details see:
First normal form - Wikipedia

I had considered suggesting going with a complete separate table of
names, containing a person_id (or in this use-case, fullname_id),
sequence number, and string, but that seemed like overkill. It would
be “normal”, but IMHO HAGNI. :slight_smile:

-Dave


Specialization is for insects. -RAH | Have Pun, Will Babble! -me
Programming Blog: http://codosaur.us | Work: http://davearonson.com
Leadership Blog: http://dare2xl.com | Play: http://davearonson.net

Robert W. wrote:

Pale H. wrote:

I tend to normalise as often as possible, and I do agree with you
Robert. But http://c2.com/cgi/wiki?YouArentGonnaNeedIt also raises a
very good point.

Given that the primary reason for flattening data, that could otherwise
be normalized, is for performance optimization reasons,
http://c2.com/cgi/wiki?PrematureOptimization is the balance to YAGNI.

Tautological, is this matter. Normalising the database is indeed,
almost always the correct methodology.