Downcase part of a string

On 10/22/06, Wilson B. [email protected] wrote:

The problem is that proper upcasing and downcasing of characters is
locale-dependent, not just encoding or language-dependent.

As examples, he mentioned that the uppercase version of accented
characters varies from area to area in France.

No, not depending on jurisdiction in France. In French French, one
would capitalize être as Etre. In Canadian French, one would
capitalize it as Être.

Also, in Turkish, there are four different cases of ‘i’, not just two… and which is
correct depends on the jurisdiction.

Not quite. There are two different ‘i’ letters: one with a dot, one
without. One is capitalized with a dot and one is capitalized without
the dot.
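
The two-letter situation can be sketched in Ruby. This is only an illustration under an assumption that postdates this thread: Ruby 2.4 or later, where `String#upcase` and `String#downcase` accept a `:turkic` option. The variable names are made up for the example.

```ruby
# Assumption: Ruby 2.4 or later. Older versions only case-map ASCII.
dotted_lower  = "i"       # U+0069, lowercase i with a dot
dotless_lower = "\u0131"  # lowercase i without a dot
dotted_upper  = "\u0130"  # capital I with a dot
dotless_upper = "I"       # U+0049, capital I without a dot

# The language-neutral mapping pairs dotted "i" with dotless "I".
default_upper = dotted_lower.upcase

# The Turkic mapping keeps the dot when upcasing and keeps the
# letter dotless when downcasing.
turkic_upper = dotted_lower.upcase(:turkic)
turkic_lower = dotless_upper.downcase(:turkic)

puts default_upper == dotless_upper
puts turkic_upper == dotted_upper
puts turkic_lower == dotless_lower
```

The caller still has to know that the text is Turkish before passing `:turkic`; nothing in the string itself says so.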

Also, the German eszet (ß, as in Schloß) would be capitalized as
SCHLOSS, but downcasing that would be schloss, not necessarily schloß.
(Actually, and the Germans here will correct me on this I’m sure, I
think it would always be Schloss or Schloß because the leading S would
not be lowercased in proper German. Looking at some German webpages
suggests so.)
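
A minimal sketch of that lossy round trip, again assuming Ruby 2.4 or later (where the default case mapping expands ß to SS); the variable names are only for illustration:

```ruby
# Assumption: Ruby 2.4 or later, with full Unicode case mapping.
word = "Schlo\u00DF"          # "Schloß"

shouted   = word.upcase       # the eszett expands to two letters
restored  = shouted.downcase  # nothing records that "SS" was once an eszett
plain_low = word.downcase     # downcasing the original keeps the eszett

puts shouted
puts restored
puts restored == plain_low    # the round trip does not get back to the original
```

Recovering "Schloß" from "SCHLOSS" would need a dictionary or some other knowledge of German, not just a character table.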

Determining the locale in a correct way is really, really hard. Tim
Bray says it’s basically impossible. Also, all of these rules make
any decent upcase/downcase function ruinously slow.

Not impossible, just fraught with errors and performance issues. One
would not only have to have the locale lookup stuff, but one would
have to do statistical analysis to get better than mostly wrong with
anything but English. :wink:

-austin

On 10/25/06, Hal F. [email protected] wrote:

That’s the way I remember it – he said that a lowercase accented
character was sometimes uppercased differently, and it varied
“from district to district.”

The wording was actually “jurisdiction to jurisdiction.” This actually
matters, because it really deals with major jurisdictions, like Québec
and Vietnam and Surinam and Algeria and…

Earlier tonight I think he mentioned Quebec (but with a proper accent
that I don’t know how to type).

Look at AllCaps, Hal. Then you’d do Ctrl, ‘, e (three separate
keypresses, so not Ctrl-’) to get é. Or do like me: get a Mac
(Option-e, e). :wink:

I wouldn’t be surprised if the French sometimes sneered a little at
the French spoken in Quebec, the way (sometimes) Brits make fun of
Americans, or Spanish (or Colombians) make fun of Mexicans.

They do. Sometimes it’s absolutely awful. Although, it’s much more
like how (American) Northerners pick on (American) Southern drawls. If
you have a chance to listen to a Québecois French speaker, listen
carefully. You’ll hear a bit of a twang like you would in the American
south. Funnily enough, it’s for similar reasons. The American north
was settled first and most often, and had the biggest blending of
language and dialect. The south, on the other hand, was settled a
little later and ended up being a bit more sparsely populated and with
fewer non-English speakers. So the old English speaking habits hung on
a little longer and were a bit more isolated.

The Québecois were separated from France at the time of the Seven
Years’ War (what Americans would call the French and Indian War). The
French immigrants of the time were taught in an école system that was
headed by the priests, who by and large spoke court French. When
France lost all of Canada to England, the priests and nobles went back
to France, rather than subjecting themselves to English rule. A mere
thirty years later, most of them lost their heads and the French
revolutionaries established schools that taught the French that they
knew. In other words, the French spoken after the revolution was
mostly the French of the streets and of the salons, not royal French.
From that point forward, the French spoken by the European French and
the Québecois diverged some.

-austin

Tim B. wrote:

On Oct 24, 2006, at 9:27 PM, Hal F. wrote:

Tim: Read my ch 4 when you can and give me your opinion. :wink:

Hal: your book is too thick, it hurts my wrists. But I’ll try. -Tim

Ha… but chapter 4 is rather thin.

Maybe you can attack someone else’s copy with an X-acto knife…
someone who doesn’t want to internationalize anyhow.

Hal

On Oct 24, 2006, at 9:27 PM, Hal F. wrote:

Tim: Read my ch 4 when you can and give me your opinion. :wink:

Hal: your book is too thick, it hurts my wrists. But I’ll try. -Tim

Austin Z. wrote:

On 10/25/06, Hal F. [email protected] wrote:

That’s the way I remember it – he said that a lowercase accented
character was sometimes uppercased differently, and it varied
“from district to district.”

The wording was actually “jurisdiction to jurisdiction.” This actually
matters, because it really deals with major jurisdictions, like Québec
and Vietnam and Surinam and Algeria and…

My memory may be faulty, but I really thought he said “district.”
Not that it matters really. Was it recorded?

[snip other interesting stuff, which reminds me a little of Wayne’s
World, where Alice Cooper gives the history lesson]

Hal

On 10/25/06, Hal F. [email protected] wrote:

My memory may be faulty, but I really thought he said “district.”
Not that it matters really. Was it recorded?

Probably. If I’m wrong, I’ll just have to owe you the beverage of your
choice when I see you next. :wink:

[snip other interesting stuff, which reminds me a little of Wayne’s
World, where Alice Cooper gives the history lesson]

Sorry. I have an interest in linguistics in general (did my final Uni
project on linguistics), history, and am an immigrant to my current
country. It also doesn’t hurt that my fiancée is a Francophone (not
from Québec, but with family origins in Mauritius, lost in the same
treaty terms as Québec).

-austin

On 10/25/06, Hal F. [email protected] wrote:

I wouldn’t be surprised if the French sometimes sneered a little at
the French spoken in Quebec, the way (sometimes) Brits make fun of
Americans, or Spanish (or Colombians) make fun of Mexicans.

And, I understand that there are some Québeçois who maintain that they
speak a purer form of French than the European French do. It’s kind
of like the theory that Elizabethan English is still spoken on some of
the islands off the coasts of Virginia and North Carolina.

Actually both French and Canadian French have evolved in different
ways, much as have British and American English.

Some Québeçois think of themselves as French, hence the “European
French” above. I have a good friend, a native niçoise, who was quite
amused when she was in Canada and some one said to her, “so you’re
French, from France!” to which her unspoken reaction was “Where
else?”


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

On 10/25/06, Rick DeNatale [email protected] wrote:

On 10/25/06, Hal F. [email protected] wrote:

I wouldn’t be surprised if the French sometimes sneered a little at
the French spoken in Quebec, the way (sometimes) Brits make fun of
Americans, or Spanish (or Colombians) make fun of Mexicans.
And, I understand that there are some Québeçois who maintain that they

This spelling is incorrect. It’s Québecois (kay-beh-kwah), not
Québeçois (kay-beh-swah).

speak a purer form of French than the European French do. It’s kind
of like the theory that Elizabethan English is still spoken on some of
the islands off the coasts of Virginia and North Carolina.

“Purer” isn’t the right term; “older” is. Said belief is mostly true;
Québec was isolated from the linguistic shifts that France experienced
in the late eighteenth and throughout the nineteenth century. And
there probably wouldn’t be any Elizabethan English speakers, but some
American English dialects are closer to Georgian (18th century)
English than they are to modern British English; I suspect that
American English dialects in general are closer to 18th century
English than either Canadian or modern British English. I haven’t
studied linguistics in a long time, but it is apparently possible to
measure linguistic drift in dialects.

Actually both French and Canadian French have evolved in different
ways, much as have British and American English.

Certainly

Some Québeçois think of themselves as French, hence the “European
French” above. I have a good friend, a native niçoise, who was quite
amused when she was in Canada and some one said to her, “so you’re
French, from France!” to which her unspoken reaction was “Where
else?”

Yes, the French (from France) are famous for being short-sighted that
way, only remembering the rest of La Francophonie when it’s
convenient. Even in Europe, France isn’t the only source of native
French speakers (Belgium, Switzerland, Luxembourg, and Romania[?!]),
and then you’ve got a lot of different former French colonies:
Mauritius (officially English, but everyone speaks French regardless
of what the Government says), Québec, New Brunswick (and in Canada,
the Métis of Manitoba matter, too), Togo, Haiti, Rwanda, and various
other places, too.

-austin

On 10/25/06, Bira [email protected] wrote:

On 10/25/06, Austin Z. [email protected] wrote:

Yes, the French (from France) are famous for being short-sighted that
way, only remembering the rest of La Francophonie when it’s
convenient.
Doesn’t ‘French’ mean “someone who was born in France”, rather than
“someone who speaks French” (‘francophone’) ?

It can also mean “someone of French ancestry.” Québecois consider
themselves French, as I understand it.

-austin

On 10/25/06, Austin Z. [email protected] wrote:

Yes, the French (from France) are famous for being short-sighted that
way, only remembering the rest of La Francophonie when it’s
convenient.

Doesn’t ‘French’ mean “someone who was born in France”, rather than
“someone who speaks French” (‘francophone’) ?

In France, accents on uppercase letters are optional; I do not think that
their use varies by region. (Tell me if I am talking nonsense, Franc; I am
not French.)
I do not know about Québec.
So Tim is wrong in a literal sense, sorry, but I guess that the point he
wanted to make is very valid: how would you downcase “TACHE”, as “tache” =>
spot or “tâche” => task?
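
The ambiguity can be made concrete with a small sketch. Everything here is hypothetical: the two-word lexicon and the helper names `strip_accents` and `downcase_candidates` are invented for illustration, and the code assumes Ruby 2.4 or later.

```ruby
# Remove combining accent marks: decompose, then drop the marks.
def strip_accents(word)
  word.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
end

# Return every lexicon entry that could have produced the capitalized
# form, either with its accents kept or with its accents dropped.
def downcase_candidates(shouted, lexicon)
  lexicon.select do |entry|
    upper = entry.upcase
    upper == shouted || strip_accents(upper) == shouted
  end
end

# Hypothetical two-word lexicon: "tache" (spot) and "tâche" (task).
lexicon = ["tache", "t\u00E2che"]

unaccented = downcase_candidates("TACHE", lexicon)
accented   = downcase_candidates("T\u00C2CHE", lexicon)

puts unaccented.length  # both words are possible sources
puts accented.length    # the accent removes the ambiguity
```

With the accent dropped on the capital, both words remain candidates, so a downcase function cannot pick one without knowing the context.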

Cheers
Robert


The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all progress
depends on the unreasonable man.

  • George Bernard Shaw

On 10/25/06, Austin Z. [email protected] wrote:

wanted to make is very valid, how would you downcase “TACHE”, as “tache” =>
spot or “tâche” => task?

Tim is not wrong. Someone’s remembering of what Tim said is wrong. In
speaking with him after, it was very clear that he was referring to
larger jurisdictions, such as the differentiation between France and
Canada on uppercase accents.

O.K., anyway, that is just off-topic(1). It was not clear, and I just wanted
to be clear about what is French French(2); there is a jurisdiction and I
should look it up afterwards.
I made too much noise about right/wrong; the important thing is the point
he made, and hopefully my example has made that clearer for one or two of
the three who still read my posts. Thx, Austin :wink:

-austin

Cheers
Robert

(1) I’d love an off-list link for the Québec rules, though, if you would
not mind. Thx. R.
(2) As I learnt and read in some books, which is not a complete proof, but
foreigners often have to learn this stuff better than native speakers.


On 10/25/06, Robert D. [email protected] wrote:

In France, accents on uppercase letters are optional; I do not think that
their use varies by region. (Tell me if I am talking nonsense, Franc; I am
not French.)

I’m pretty sure that whether or not accented capital letters are
actually accented in French has more to do with typography and
technology than geography or jurisdiction.

I’ve heard that accents were dropped on capital letters because
accented capitals weren’t commonly available on typewriters. I also
understand that it’s falling out of common practice, probably due to
the proliferation of digital fonts.

I just checked with the wikipedia and it seems to confirm this FWIW:


Rick DeNatale


On 10/25/06, Robert D. [email protected] wrote:

In France, accents on uppercase letters are optional; I do not think that
their use varies by region. (Tell me if I am talking nonsense, Franc; I am
not French.)
I do not know about Québec.
So Tim is wrong in a literal sense, sorry, but I guess that the point he
wanted to make is very valid: how would you downcase “TACHE”, as “tache” =>
spot or “tâche” => task?

Tim is not wrong. Someone’s remembering of what Tim said is wrong. In
speaking with him after, it was very clear that he was referring to
larger jurisdictions, such as the differentiation between France and
Canada on uppercase accents.

Again: I don’t agree with Tim on some of his positions, but this is
not a place he made a mistake.

-austin

On 26 October 2006 at 02:04, Rick DeNatale wrote:

I’ve heard that accents were dropped on capital letters because
accented capitals weren’t commonly available on typewriters.

That was my understanding too.

In any case, the Académie française insists that the capital letters
must be accented, if only because some accents can change the meaning of
a word in French.

Fred

On 10/26/06, F. Senault [email protected] wrote:

On 26 October 2006 at 02:04, Rick DeNatale wrote:

I’ve heard that accents were dropped on capital letters because
accented capitals weren’t commonly available on typewriters.

That was my understanding too.

In any case, the Académie française insists that the capital letters
must be accented, if only because some accents can change the meaning of
a word in French.

Bien sur. I already communicated this privately to Robert D…

En Frinch, the accints ari as moch an aspict uf tha spilleng as which
bisac vowill ur cansonint thiy ure atiched to.


Rick DeNatale


On 27 October 2006 at 04:33, John W. Kennedy wrote:

designed with the extra room on top, and no one seems to have thought of
(or rejected on aesthetic grounds) shrinking the letters a bit to fit
the accent in.

Colour me very surprised. I’d add that I could find more references to
my hypothesis than yours on the web, but not being a specialist, I’ll
just stay dubitative.

Now, we’re still way off topic for this, so, if you have some pointers
to add or wish to continue the conversation, my email address is valid
(if heavily protected against spam).

Fred

F. Senault wrote:

designed with the extra room on top, and no one seems to have thought of
(or rejected on aesthetic grounds) shrinking the letters a bit to fit
the accent in.

Colour me very surprised. I’d add that I could find more references to
my hypothesis than yours on the web, but not being a specialist, I’ll
just stay dubitative.

Now, we’re still way off topic for this, so, if you have some pointers
to add or wish to continue the conversation, my email address is valid
(if heavily protected against spam).

I’ve spent a considerable amount of time in the last couple of years
working with the first edition of the 1798 play, “André: a Tragedy in
Five Acts”. 'Nuff said.

On 11/4/06, John W. Kennedy [email protected] wrote:

No, it goes back at least to the 18th century, to my certain knowledge,
Now, we’re still way off topic for this, so, if you have some pointers
to add or wish to continue the conversation, my email address is valid
(if heavily protected against spam).

I’ve spent a considerable amount of time in the last couple of years
working with the first edition of the 1798 play, “André: a Tragedy in
Five Acts”. 'Nuff said.

That predates Ada!!!

John W. Kennedy
“The blind rulers of Logres
Nourished the land on a fallacy of rational virtue.”
– Charles Williams. “Taliessin through Logres: Prelude”



F. Senault wrote:

On 26 October 2006 at 02:04, Rick DeNatale wrote:

I’ve heard that accents were dropped on capital letters because
accented capitals weren’t commonly available on typewriters.

That was my understanding too.

No, it goes back at least to the 18th century, to my certain knowledge,
long before typewriters were invented. The typefaces just weren’t
designed with the extra room on top, and no one seems to have thought of
(or rejected on aesthetic grounds) shrinking the letters a bit to fit
the accent in.