Multibyte and Gems

I’ve tracked down a problem with a gem I am trying to use. It turns
out that it has some non-ASCII characters in it; for example, the
second quote in the regular expression below is not an ASCII character:

 parts = self.split( %r/( [:.;?!][ ] | (?:[ ]|^)["“] )/x )

It produces errors like this:

:in `require': /opt/local/lib/ruby1.9/gems/1.9.1/gems/webby-0.9.4/lib/webby/core_ext/string.rb:14:
invalid multibyte char (US-ASCII) (SyntaxError)

I fixed it by adding the following to the top of the offending file:

# encoding: utf-8
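
For reference, here is a minimal sketch of what the top of such a file might look like. The magic comment has to be the first line of the file (or the second, if the first is a shebang line). `__ENCODING__` reports the source encoding Ruby is using for the file. The `sample` string is made up for illustration and is not taken from the gem.

```ruby
# encoding: utf-8
# The magic comment above must be the first line of the file
# (or the second line, directly after a shebang).

# __ENCODING__ is the encoding Ruby assumes for this source file.
source_encoding = __ENCODING__

# A literal containing non-ASCII quotes, similar to the one in the regexp.
# (Illustrative value only.)
sample = "“quoted”"

puts source_encoding
puts sample.encoding
puts sample.valid_encoding?
```

String literals take the source encoding, which is why the parser needs to know it before it can accept the non-ASCII quote.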

My questions:

  • Is this the preferred fix?

  • Is there a way to work around this problem without modifying the Gem?

  • Is there an easy way to see if gems have non-ascii source files but
    haven’t included an encoding comment? Some kind of Ruby warning for
    instance.

On Jun 22, 2009, at 22:26, Martin H. wrote:

lib/webby/core_ext/string.rb:14: invalid multibyte char (US-ASCII)
(SyntaxError)

I fixed it by adding the following to the top of the offending file:

# encoding: utf-8

My questions:

  • Is this the preferred fix?

Yes.

  • Is there a way to work around this problem without modifying the
    Gem?

File a bug with the author and have them release a new version,
otherwise no.

  • Is there an easy way to see if gems have non-ascii source files
    but haven’t included an encoding comment? Some kind of Ruby warning
    for instance.

ruby -c will do this for you.
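
If you would rather scan a whole gem directory than run `ruby -c` on each file, something along these lines might work. This is only a rough sketch: the helper names (`magic_comment?`, `needs_encoding_comment?`, `files_missing_encoding`) are made up for this example and are not part of Ruby or RubyGems, and the regexp only approximates how Ruby recognises a magic comment.

```ruby
# encoding: utf-8
# Hypothetical helper script: lists .rb files that contain non-ASCII
# bytes but have no encoding magic comment.

# Approximates magic comments such as "# encoding: utf-8" or
# "# -*- coding: utf-8 -*-".
MAGIC_COMMENT = /\A#.*coding[:=]\s*[\w.-]+/

# True if the first line (or the second, after a shebang) is a magic comment.
def magic_comment?(data)
  first, second = data.lines.first(2)
  candidate = first.to_s.start_with?("#!") ? second : first
  !!(candidate.to_s =~ MAGIC_COMMENT)
end

# True if the source has non-ASCII bytes and no magic comment.
def needs_encoding_comment?(data)
  !data.ascii_only? && !magic_comment?(data)
end

# Returns the paths of all .rb files under dir that need a magic comment.
def files_missing_encoding(dir)
  Dir.glob(File.join(dir, "**", "*.rb")).sort.select do |path|
    needs_encoding_comment?(File.binread(path))
  end
end

# Usage: ruby check_encoding.rb /path/to/gems   (the script name is arbitrary)
ARGV.each { |dir| puts files_missing_encoding(dir) }
```

The files are read in binary mode so the check does not depend on the default external encoding of the Ruby that runs the script.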

So is it considered best practice nowadays to put an encoding comment at the
beginning of all your files? Such as:

# encoding: utf-8

or whatever encoding you like. Is this what people are doing, or are
they adding it one-off to the files that have non-ASCII characters?

It seems to me that if you have a modern editor it isn’t too hard to
accidentally slip in some non-ascii characters resulting in some pain
down the road.

On Wed, Jun 24, 2009 at 1:43 AM, Martin H. wrote:

It seems to me that if you have a modern editor it isn’t too hard to
accidentally slip in some non-ascii characters resulting in some pain
down the road.

It isn’t hard to mess up any code in a lot of ways, so as usual, try
to run/test it before you release/deploy :)
That also means that using Ruby 1.9.1 for your daily coding might be a
better choice, otherwise you’ll have to use multiruby.

On Jun 24, 2009, at 19:08, Michael F. wrote:

It seems to me that if you have a modern editor it isn’t too hard to
accidentally slip in some non-ascii characters resulting in some pain
down the road.

It isn’t hard to mess up any code in a lot of ways, so as usual, try
to run/test it before you release/deploy :)
That also means that using Ruby 1.9.1 for your daily coding might be a
better choice, otherwise you’ll have to use multiruby.

With hoe, it’s as easy as:

multiruby_setup the_usual # only once
rake multi