Unexpected length of £ (pound) character?

Hi,

Today I came across an issue with a customer's custom report, which was out
by 1 char over 40 or so lines. At first I thought I had incorrectly
limited the field length, but the problem is only present where
there is a ‘£’ char.

For example:
"1234".length => 4
"1234£".length => 6 (Expect 5)
"1234£6".length => 7 (Expect 6)

Tested on:
ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
and:
ruby 1.8.6 (2009-03-31 patchlevel 368) [x86_64-linux]

I could not find anything on Google covering this (perhaps my Google-fu
needs work), which brought me here.

Is this expected functionality in ruby? It does not seem right in my
mind.

Thanks. :slight_smile:

On Fri, Jun 4, 2010 at 6:19 PM, Anthony Ss wrote:

For example:
"1234".length => 4
"1234£".length => 6 (Expect 5)
"1234£6".length => 7 (Expect 6)

can’t help you there, but fyi

RUBY_VERSION
=> "1.9.2"
"1234".length
=> 4
"1234£".length
=> 5
"1234£6".length
=> 6

kind regards -botp

Anthony Stenhouse wrote:

Hi,

Today I came across an issue with a customer's custom report, which was out
by 1 char over 40 or so lines. At first I thought I had incorrectly
limited the field length, but the problem is only present where
there is a ‘£’ char.

For example:
"1234".length => 4
"1234£".length => 6 (Expect 5)
"1234£6".length => 7 (Expect 6)

In UTF-8, "£" is two bytes (\302\243), and String#length in ruby 1.8 returns the number of bytes, not characters.
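
To make the byte count concrete, here is a small sketch, assuming the source file is saved as UTF-8. It relies only on String#unpack, which should behave the same way on 1.8 and on later versions: the "C*" directive yields the raw bytes, while "U*" decodes UTF-8 code points.

```ruby
pound = "£"                      # assumed to be stored as UTF-8

bytes      = pound.unpack("C*")  # raw bytes
codepoints = pound.unpack("U*")  # decoded UTF-8 code points

p bytes       # [194, 163], i.e. "\302\243" in octal
p codepoints  # [163], i.e. U+00A3

p "1234£".unpack("C*").length  # 6 bytes
p "1234£".unpack("U*").length  # 5 characters
```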

If you want to capture (say) the first 6 characters of the string, try
this:

a = "1234£6789"
=> "1234\302\2436789"

a =~ /\A(.{6})/u
=> 0

puts $1
1234£6
=> nil

This may be sufficient for simple wrapping functions. Or look at the
Iconv library.
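
For a fixed-width report like the one in the question, one possible approach is to count, cut, and pad by characters rather than bytes. The sketch below assumes the strings are valid UTF-8 and treats each code point as one character (so combining marks are not handled). `char_length`, `first_chars`, and `pad_field` are hypothetical helper names, not part of Ruby or any library.

```ruby
# Hypothetical helpers; they assume the strings are valid UTF-8.

# Number of characters (code points), not bytes.
def char_length(str)
  str.unpack("U*").length
end

# First n characters, without cutting a multi-byte character in half.
def first_chars(str, n)
  str.unpack("U*").first(n).pack("U*")
end

# Truncate or right-pad with spaces so the field is exactly
# width characters wide.
def pad_field(str, width)
  cut = first_chars(str, width)
  cut + " " * (width - char_length(cut))
end

puts char_length("1234£6")           # 6
puts first_chars("1234£6789", 6)     # 1234£6
puts "[" + pad_field("£5", 6) + "]"  # [£5    ]
```

Going through unpack/pack avoids depending on $KCODE or on the regexp /u flag, at the cost of building an intermediate array for every string.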

Is this expected functionality in ruby? It does not seem right in my
mind.

ruby 1.9 works in characters. It brings with it enormous complexity,
pitfalls and inconsistencies. Pick your poison :slight_smile:
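
For comparison, a short sketch of the character-based behaviour, assuming Ruby 1.9 or later and a UTF-8 source file (1.9 needs the magic comment on the first line; later versions default to UTF-8). Here `length` counts characters and `bytesize` counts bytes.

```ruby
# encoding: utf-8
s = "1234£6"

puts s.length    # 6 characters
puts s.bytesize  # 7 bytes
puts s.encoding  # UTF-8
puts s[0, 5]     # 1234£
```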

On 2010-06-04 06:19:09 -0400, Anthony Ss said:

Thanks. :slight_smile:

Do yourself a favor, friend, and read this excellent article on
Character Encoding:

Then find out whether your file and your interpreter are using the same
character encoding. :stuck_out_tongue:
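
One way to see why a mismatch matters, sketched for Ruby 1.9 or later (String#encode and the Encoding class do not exist in 1.8): the same "£" occupies one byte in ISO-8859-1 and two in UTF-8, so any byte-based length depends on which encoding the data is really in. The variable names are only illustrative.

```ruby
# encoding: utf-8
utf8   = "£"
latin1 = utf8.encode("ISO-8859-1")

p utf8.bytes.to_a    # [194, 163]
p latin1.bytes.to_a  # [163]

# A Latin-1 pound sign mislabelled as UTF-8 is not a valid UTF-8 string.
raw = [0xA3].pack("C").force_encoding("UTF-8")
p raw.valid_encoding?  # false

# What the interpreter is currently using; the values depend on
# the environment, so no output is shown here.
p __ENCODING__
p Encoding.default_external
```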

On Fri, Jun 4, 2010 at 6:47 PM, Brian C. wrote:

If you want to capture (say) the first 6 characters of the string, try

a = "1234£6789"
=> "1234\302\2436789"
a =~ /\A(.{6})/u
=> 0
puts $1
1234£6
=> nil

"1234£6789"[0..5]
=> "1234£6"

ruby 1.9 works in characters. It brings with it enormous complexity,
pitfalls and inconsistencies. Pick your poison :slight_smile:

All programmers are optimists. -Frederick Brooks, Jr.

:wink:

kind regards -botp