Get a portion of a utf8 encoded string

hi all.

ruby 1.8.5 (2006-08-25) [i386-linux]

puts 'hällo'
hällo
=> nil

puts 'hällo'[0, 2]
hâ–’
=> nil

puts 'hällo'[0, 3]
hä
=> nil

require 'iconv'
=> true

puts 'hällo'[0, 3]
hä
=> nil

puts 'hällo'[0, 2]
hâ–’
=> nil

The ä character is probably 2 bytes long, so puts 'hällo'[1, 1] returns
only a “piece” of the ä character. Is there a way to make this work?
'hällo'[0, 2] should return -> hä
and
'hällo'[0, 3] should return -> häl
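
To make the byte/character mismatch concrete, here is a small sketch. It assumes the string and the source file are UTF-8, and it uses only String#unpack, which should behave the same on 1.8 and on later versions. The variable names are just for illustration.

```ruby
# Assumes the source file and the string are UTF-8 encoded.
s = 'hällo'

# 'C*' unpacks raw bytes: ä shows up as the two bytes 195 and 164.
byte_values = s.unpack('C*')

# 'U*' unpacks UTF-8 code points: ä is the single value 228.
code_points = s.unpack('U*')

puts byte_values.inspect
puts code_points.inspect
```

The string has six bytes but five characters, so a byte-based slice of length 2 ends in the middle of ä.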

thanks

Hi,

2008/9/16 Marco:

puts 'hällo'[0, 3]

You can do it using a regular expression:

'hällo'.split(//u)[0, 2].to_s → hä
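
The same idea can be wrapped in a small helper. This is only a sketch: utf8_slice is a hypothetical name, not something Ruby provides, and it assumes the input is valid UTF-8.

```ruby
# Hypothetical helper, not part of Ruby: slice a UTF-8 string by characters.
def utf8_slice(str, start, length)
  chars = str.split(//u)              # one array element per UTF-8 character
  # Array#[] returns nil when start is past the end, so fall back to [].
  (chars[start, length] || []).join
end

puts utf8_slice('hällo', 0, 2)
puts utf8_slice('hällo', 0, 3)
```

The sketch uses join rather than to_s because Array#to_s concatenates the elements on 1.8 but returns inspect-style output on 1.9 and later.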

Regards,

Park H.

Heesob P. wrote:

You can do it using a regular expression:

'hällo'.split(//u)[0, 2].to_s -> hä

thanks. even if 'hällo'[0, 2] looks better, this works.

From: S2

thanks. even if 'hällo'[0, 2] looks better

upgrade?

[RUBY_VERSION, RUBY_PATCHLEVEL, RUBY_REVISION, RUBY_PLATFORM]
=> ["1.8.7", 5000, 0, "i686-linux"]

puts 'hällo'
hällo
=> nil

puts 'hällo'[0, 2]
hä
=> nil

puts 'hällo'[0, 3]
häl
=> nil

Heesob P. wrote:

You can do it using a regular expression:

'hällo'.split(//u)[0, 2].to_s -> hä

Why make it convoluted when Ruby makes it so easy to use regexps:

puts 'hällo'[/.{2}/um]
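
If the number of characters is not fixed, the pattern can be built with interpolation. A rough sketch, where utf8_prefix is a made-up name and count is assumed to be a non-negative integer:

```ruby
# Hypothetical helper: return the first `count` characters of a UTF-8 string.
def utf8_prefix(str, count)
  # \A anchors the match at the start of the string. The u flag makes "."
  # match a whole UTF-8 character, and m lets "." match newlines as well.
  str[/\A.{0,#{count}}/um]
end

puts utf8_prefix('hällo', 2)
```

Using {0,count} instead of {count} means a string shorter than count comes back whole rather than producing nil.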
hä

On Tue, Sep 16, 2008 at 4:50 AM, Peña, Botp wrote:

From: S2

thanks. even if 'hällo'[0, 2] looks better

upgrade?

It concerns me that people are suggesting using various
backwards-incompatible changes in Ruby 1.8.7.

Though 1.8.7 may be a good fallback for 1.9-only libraries that may
make things work as expected, I hate the idea of writing code that
works on 1.8.7 but not other versions of Ruby 1.8. At this point, I
feel like 1.8.6 is still the ‘real’ Ruby 1.8, anyway.

-greg

On Wed, Sep 17, 2008 at 9:36 PM, Peña, Botp wrote:

cmon, greg, what could be better than

'hällo'[0, 2]

?

the change is good.

The change is absolutely good! I did a training session at Lone Star
Ruby Conference that sang its praises.

What isn’t good is to say “I’ve not yet updated my code for Ruby 1.9”,
so it works on Ruby 1.8.7 only.

My point is that if people really want to use Ruby 1.9 features for
anything but experimentation, they’d do more good by actually using
1.9, not an intermediate release that left most people confused.

If you are writing Ruby 1.8.7 specific code, your code may not run on
Ruby 1.9. For example, the code above will blow up on 1.9 unless the
encoding is properly set.

And your code definitely won’t work on Ruby 1.8.x aside from 1.8.7.
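
To illustrate the encoding point, here is a sketch for Ruby 1.9 or later only. On those versions a string carries an encoding, and the same bytes slice differently depending on it. The variable names are illustrative, and the binary string just stands in for data read without an encoding.

```ruby
# Ruby 1.9+ only: String#[] counts characters according to the string's encoding.

# The bytes of "hällo", tagged as binary (no character encoding).
raw = "h\xC3\xA4llo".dup.force_encoding(Encoding::BINARY)
raw_slice = raw[0, 2]               # two bytes: "h" plus the first half of ä

# The same bytes, tagged as UTF-8.
text = raw.dup.force_encoding(Encoding::UTF_8)
text_slice = text[0, 2]             # two characters

puts raw_slice.bytes.to_a.inspect
puts text_slice
puts text.valid_encoding?
```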

So if you’re okay locking to a single point release, that’s fine. But
I think it’d help ruby-core a lot more for you to use Ruby 1.9 and
help them iron out issues, and it’d help the Ruby community a lot more
for you to choose whether you are supporting 1.8.x, 1.9.x or both, but
not something that is neither (1.8.7).

-greg

From: Gregory B.

On Tue, Sep 16, 2008 at 4:50 AM, Peña, Botp wrote:

> From: S2

> # thanks. even if 'hällo'[0, 2] looks better

> upgrade?

It concerns me that people are suggesting using various
backwards-incompatible changes in Ruby 1.8.7.

Though 1.8.7 may be a good fallback for 1.9-only libraries that may
make things work as expected, I hate the idea of writing code that
works on 1.8.7 but not other versions of Ruby 1.8. At this point, I
feel like 1.8.6 is still the ‘real’ Ruby 1.8, anyway.

cmon, greg, what could be better than

'hällo'[0, 2]

?

the change is good.

kind regards -botp