Hello,
Is there an easy way to chop (as in String#chop) a string that can
potentially contain UTF-8 in ruby 1.8? Or should I roll my own?
Thanks,
Ammar
Ended up making my own. Posting it here for the benefit of others, and
maybe some feedback.
UTF-8 aware string chop · GitHub
Regards,
Ammar
On Nov 3, 2010, at 9:08 AM, Ammar A. wrote:
Is there an easy way to chop (as in String#chop) a string that can
potentially contain UTF-8 in ruby 1.8? Or should I roll my own?
Well, it should be this simple:
str.gsub(/.\z/mu, "")
James Edward G. II
On Wed, Nov 3, 2010 at 5:57 PM, James Edward G. II wrote:
Well, it should be this simple:
str.gsub(/.\z/mu, "")
On Wed, Nov 3, 2010 at 6:04 PM, Adam P. wrote:
s.gsub(/^(.+)./u) { $1 }
=> "one two thre"
Beautiful. Thank you both.
It was a good exercise for me, so I don’t necessarily feel that I
wasted 30 minutes of my life.
By the way, the m option seems superfluous in James’ version. I get
the same results without it.
Thanks again,
Ammar
On Nov 3, 2010, at 11:33 AM, Ammar A. wrote:
Beautiful. Thank you both.
It was a good exercise for me, so I don’t necessarily feel that I
wasted 30 minutes of my life.
By the way, the m option seems superfluous in James’ version. I get
the same results without it.
It’s not:
"\n".sub(/.\z/u, "")
=> "\n"
"\n".sub(/.\z/mu, "")
=> ""
Using gsub() over sub() was a dumb mistake on my part though. sub() is
all you need, since it can only match once.
James Edward G. II
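For anyone following along, here is a small sketch of the difference James describes. The helper names chop_u and chop_mu are made up for illustration, and the byte escapes spell out "é" in UTF-8 so the snippet does not depend on how the file itself is encoded.

```ruby
# Hypothetical helpers for illustration, not code from the gist.

# Without /m, "." does not match a newline, so a trailing "\n" survives.
def chop_u(str)
  str.sub(/.\z/u, "")
end

# With /m, "." matches any character, including "\n".
def chop_mu(str)
  str.sub(/.\z/mu, "")
end

cafe = "caf\xC3\xA9"   # "café" written as UTF-8 bytes

chop_mu(cafe)          # removes the whole two-byte character
chop_u("abc\n")        # no match, so the string comes back unchanged
chop_mu("abc\n")       # removes the trailing newline
```

Because \z only matches at the very end of the string, sub() finds at most one match, which is why gsub() buys nothing here.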
I was going to say
$KCODE="U"
=> "U"
s = "one two three"
=> "one two three"
s.gsub(/^(.+)./u) { $1 }
=> "one two thre"
I guess I overthought it, huh!
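A note for readers comparing the two approaches: they agree on a single-line string, but they appear to differ on edge cases, because ^ anchors at the start of every line while \z anchors only at the end of the string. The sketch below is illustrative only, and chop_anchor and chop_end are made-up names.

```ruby
# Illustrative comparison; both method names are hypothetical.

def chop_anchor(s)
  s.gsub(/^(.+)./u) { $1 }   # ^ matches at the start of every line
end

def chop_end(s)
  s.sub(/.\z/mu, "")         # \z matches only at the very end of the string
end

chop_anchor("one two three")  # same result as chop_end on a single line
chop_anchor("one\ntwo")       # chops the last character of each line
chop_end("one\ntwo")          # chops only the final character
chop_anchor("a")              # needs two characters to match, so nothing is chopped
```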
On Wed, Nov 3, 2010 at 6:38 PM, James Edward G. II wrote:
Using gsub() over sub() was a dumb mistake on my part though. sub() is all you
need, since it can only match once.
Thanks for the clarification.
My method now looks like:
def chop_utf8(s)
  return unless s
  lead = s.sub(/.\z/mu, "")
  last = s.scan(/.\z/mu).first
  last = '' unless last
  [lead, last]
end
Short and sweet.
Cheers,
Ammar
On Nov 3, 2010, at 11:56 AM, Ammar A. wrote:
My method now looks like:
def chop_utf8(s)
  return unless s
  lead = s.sub(/.\z/mu, "")
  last = s.scan(/.\z/mu).first
  last = '' unless last
The two lines above can be replaced with the more efficient:
  last = s[/.\z/mu] || ''
  [lead, last]
end
James Edward G. II
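Putting the suggestions so far together, the method would presumably end up looking something like the sketch below. The gist itself is not reproduced in this thread, so this is only a reading of where the discussion stands, not necessarily what was posted there.

```ruby
# Sketch of the method with the thread's suggestions folded in.
# Returns [everything but the last character, the last character].
def chop_utf8(s)
  return unless s
  lead = s.sub(/.\z/mu, "")   # sub, not gsub: \z can only match once
  last = s[/.\z/mu] || ""     # String#[] with a regexp returns the match or nil
  [lead, last]
end

chop_utf8("caf\xC3\xA9")   # lead is "caf", last is the two-byte "é"
chop_utf8("")              # both parts are empty strings
chop_utf8(nil)             # nil, because of the early return
```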
On Wed, Nov 3, 2010 at 7:00 PM, James Edward G. II wrote:
The two lines above can be replaced with the more efficient:
last = s[/.\z/mu] || ''
At this rate the method is going to disappear.
I updated the gist accordingly:
UTF-8 aware string chop. (the first gist was posted as anonymous) · GitHub
Thanks again,
Ammar
On Thu, Nov 4, 2010 at 1:25 AM, Ammar A. wrote:
On Wed, Nov 3, 2010 at 7:00 PM, James Edward G. II wrote:
last = s[/.\z/mu] || ''
I updated the gist accordingly:
UTF-8 aware string chop. (the first gist was posted as anonymous) · GitHub
Can we make that one pass?
str =~ /.\z/mu
[$`,$&]
best regards -botp
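Wrapped up as a method, botp's one-pass idea might look like the sketch below. The name chop_utf8_one_pass and the handling of nil and empty input are assumptions added here to mirror the earlier chop_utf8; botp's snippet only shows the match and the two special variables.

```ruby
# Hypothetical one-pass variant based on botp's suggestion.
def chop_utf8_one_pass(s)
  return unless s
  if s =~ /.\z/mu
    # $` is the text before the match, $& is the matched text itself.
    [$`, $&]
  else
    # Nothing matched (empty string), so there is nothing to chop.
    [s, ""]
  end
end

chop_utf8_one_pass("caf\xC3\xA9")   # lead is "caf", last is the two-byte "é"
chop_utf8_one_pass("")              # both parts are empty strings
```

The regexp runs once here instead of twice, at the cost of relying on the match variables set by =~.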
Ammar A. wrote in post #959047:
By the way, the m option seems superfluous in James’ version. I get
the same results without it.
foo = "abc\n"
=> "abc\n"
foo.sub(/.\z/mu, '')
=> "abc"
foo.sub(/.\z/u, '')
=> "abc\n"
On Thu, Nov 4, 2010 at 4:37 PM, Brian C. wrote:
Ammar A. wrote in post #959047:
By the way, the m option seems superfluous in James’ version. I get
the same results without it.
foo = "abc\n"
=> "abc\n"
foo.sub(/.\z/mu, '')
=> "abc"
foo.sub(/.\z/u, '')
=> "abc\n"
James clarified this earlier. But thanks for chiming in nonetheless.
Cheers,
Ammar