Forum: Ruby UTF in Regexp

Wido Menhardt (abu-mats)
on 2007-02-03 20:15
I am sorrrrrry, but I am banging my head against this, and can't seem to
find the answer!

Text gets displayed in an input field in a web page with “
prepended and ” appended to the string (needs to be inside the
string otherwise it looks funny). The user edits it, and when it comes
back to the (Rails) backend, the new string with (possibly) these quotes
attached comes back, but encoded as UTF-8.

So the string possibly starts with UTF “ and possibly ends with
UTF ”

I want to do a regexp removal. Here is what works (but I am embarrassed):

  ldquo = '123'; ldquo[0] = 226; ldquo[1] = 128; ldquo[2] = 156
  rdquo = '123'; rdquo[0] = 226; rdquo[1] = 128; rdquo[2] = 157
  string.gsub!(/(\A#{ldquo}|#{rdquo}\Z)/,'')

There must be a better way.


Abu Mats al-Nemsi
Jan Svitok (Guest)
on 2007-02-03 21:12
(Received via mailing list)
On 2/3/07, Wido Menhardt <a@menhardt.com> wrote:
> So the string possibly starts with UTF “ and possibly ends with
> UTF ”
>
> I want to do a regexp removal. Here is what works (but I am embarrassed):
>
>   ldquo = '123'; ldquo[0] = 226; ldquo[1] = 128; ldquo[2] = 156
>   rdquo = '123'; rdquo[0] = 226; rdquo[1] = 128; rdquo[2] = 157
>   string.gsub!(/(\A#{ldquo}|#{rdquo}\Z)/,'')
>
> There must be a better way.

1. It's possible to write the bytes directly as escapes, either in octal
(226 = "\342") or in hex (226 = "\xe2"):

string.gsub!(/\A\xe2\x80\x9c|\xe2\x80\x9d\Z/, '')

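2. | has the lowest precedence, so your regex is equal to /(\Aldquo)|(rdquo\Z)/.
That is what you want here, since each alternative keeps its own anchor; the
outer capturing group is just not needed. If you group at all, put a
non-capturing group (?:...) around each alternative, not around the alternation
itself: /\A(?:ldquo|rdquo)\Z/ would only match a string that is nothing but one
quote character.

string.gsub!(/(?:\A\xe2\x80\x9c)|(?:\xe2\x80\x9d\Z)/, '')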

3. There's also the iconv library, which will convert between encodings for you.
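
For anyone on Ruby 1.9 or later, points 1 and 2 can be sketched roughly as below. This is only a sketch, not something from the thread: strip_curly_quotes, LDQUO and RDQUO are made-up names, and it assumes the incoming string is valid UTF-8 (a binary-encoded string would need force_encoding first). It uses \u escapes instead of raw bytes, and each alternative keeps its own anchor.

```ruby
# Hypothetical helper, not from the thread. Assumes Ruby 1.9+ and a
# UTF-8 encoded input string.
LDQUO = "\u201C" # left double quotation mark, UTF-8 bytes 226 128 156
RDQUO = "\u201D" # right double quotation mark, UTF-8 bytes 226 128 157

def strip_curly_quotes(text)
  # | binds more loosely than concatenation, so this reads as
  # (\A + LDQUO) or (RDQUO + \Z); no extra grouping is needed.
  text.gsub(/\A#{LDQUO}|#{RDQUO}\Z/, '')
end

strip_curly_quotes("\u201Chello\u201D") # => "hello"
strip_curly_quotes("hello")             # => "hello"
```

Quotes in the middle of the string are left alone, because neither anchor can match there.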