Yes, LANG is affecting the result in irb, but not in ruby.
$ irb -v
irb 0.9.5(05/04/13)
Whether the irb behavior is “correct” or anomalous is probably a
question for the maintainers to debate. The man page for ctype(3)
(on my Mac OS X 10.4.8) indicates that the macros are supposed to be
based on the locale, and my copy of the pickaxe (p.71) says that the
character classes are based on the ctype macros of the same name.
However, a quick C program shows effectively the same behavior as
ruby (i.e., only the characters in [0-9A-Za-z] satisfy isalnum(), even
for nl_NL).
I’m now more curious as to how irb is finding the character classes.
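A rough Ruby counterpart of that C check, in case anyone wants to compare the two environments directly. This is only a sketch: the method name alnum_bytes is made up for illustration, and it assumes that testing each single-byte string against the POSIX [[:alnum:]] class is a fair stand-in for calling isalnum() on each byte.

```ruby
# Hypothetical helper (not from the thread): collect every byte value
# 0..255 whose one-byte string matches the POSIX [[:alnum:]] class.
def alnum_bytes
  (0..255).select do |b|
    one_byte = [b].pack('C')          # build a single-byte string
    one_byte =~ /[[:alnum:]]/ ? true : false
  end
end
```

Running `puts alnum_bytes.pack('C*')` once as a stand-alone script and once inside irb, with the same LANG setting, should show whether the two really classify different bytes as alphanumeric.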
It turns out that the poster who mentioned possible interference from
the readline(3) library was right.
This is very unexpected and undesirable behaviour and, as such,
probably qualifies as a bug.
Yeah, seems so. Unless it’s documented behavior.
Interestingly, adding "require 'readline'" to the stand-alone script
does not introduce this behaviour, so it must be something to do with
the initialisation that irb does.
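To narrow down what differs between the two environments, something like the following could be run in both irb and a stand-alone script. It is a sketch only: locale_report is a hypothetical name, and it assumes the relevant state is visible through the usual locale environment variables plus whether the Readline constant has been defined.

```ruby
# Hypothetical diagnostic (not from the thread): report the locale-related
# environment variables and whether the readline extension is loaded.
def locale_report
  loaded = defined?(Readline)         # nil unless readline has been required
  {
    :lang            => ENV['LANG'],
    :lc_all          => ENV['LC_ALL'],
    :lc_ctype        => ENV['LC_CTYPE'],
    :readline_loaded => !loaded.nil?
  }
end
```

Printing the result with `p locale_report` in each environment, and again under `irb --noreadline`, gives a side-by-side view. It will not show what irb's initialisation does internally, only whether the visible inputs are the same.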
It’s really strange as both print the same output. How about doing this
just to be sure that both strings contain the same sequence of bytes:
On Thu 15 Feb 2007 at 12:39:21 +0900, Rob B. wrote:
On Fri 16 Feb 2007 at 00:40:08 +0900, Robert K. wrote:
$ irb --noreadline
It’s really strange as both print the same output.
You mean that both of them show foo to contain the same string of bytes?
How about doing this
just to be sure that both strings contain the same sequence of bytes:
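One way to make such a byte-level comparison, offered only as a sketch and not necessarily what the poster had in mind. The helper names byte_dump and same_bytes? are invented here; String#unpack with the 'C*' directive is the only library call relied on.

```ruby
# Hypothetical helpers (not from the thread).

# Render a string as space-separated hex bytes, e.g. for printing foo
# in both irb and a stand-alone script.
def byte_dump(str)
  str.unpack('C*').map { |b| '%02x' % b }.join(' ')
end

# True when both strings consist of exactly the same byte sequence,
# regardless of how they happen to be displayed.
def same_bytes?(a, b)
  a.unpack('C*') == b.unpack('C*')
end
```

Calling `puts byte_dump(foo)` in each environment and comparing the two lines would settle whether the strings differ or only their classification does.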