What are the C extension analogs of String#force_encoding and String#encode?

Within C extension code, what are the appropriate C functions to use
that provide the equivalent functionality of String#force_encoding and
String#encode? I wasn’t able to find anything in the README.EXT
(http://svn.ruby-lang.org/repos/ruby/tags/v1_9_3_194/README.EXT).

= String#force_encoding
Based on these articles [1] [2], it seems as though the suggested
approach for forcing an encoding change in a C extension is to use the
‘rb_enc_associate_index’ function defined in ‘ruby/encoding.h’. Is
that accurate?

The implementation of String#force_encoding does a bit more than that:

static VALUE
rb_str_force_encoding(VALUE str, VALUE enc)
{
    str_modifiable(str);
    rb_enc_associate(str, rb_to_encoding(enc));
    ENC_CODERANGE_CLEAR(str);
    return str;
}

This implementation adds an invocation of the ENC_CODERANGE_CLEAR
macro and I’m not clear on what this does or when/why it would be
needed.

= String#encode
I wasn’t able to find any documentation about String transcoding
within C extensions. Based on the implementation of String#encode,
there seems to be a function ‘rb_str_encode’ in ‘ruby/encoding.h’ that
might be appropriate, but there’s no documentation for the method and
I couldn’t completely reverse engineer how the ‘ecflags’ and ‘ecopts’
arguments are used.

-Nathan

[1] How to port your gem to Ruby 1.9 - 世界線航跡蔵
[2] String Encoding in Ruby 1.9 C extensions | Tenderlove Making

Nathan B. wrote in post #1069253:

I wasn’t able to find anything in the README.EXT
(http://svn.ruby-lang.org/repos/ruby/tags/v1_9_3_194/README.EXT).

Welcome to ruby 1.9.x, where everything to do with string encodings is
completely undocumented.

The implementation of String#force_encoding does a bit more than that:

static VALUE
rb_str_force_encoding(VALUE str, VALUE enc)
{
    str_modifiable(str);
    rb_enc_associate(str, rb_to_encoding(enc));
    ENC_CODERANGE_CLEAR(str);
    return str;
}

This implementation adds an invocation of the ENC_CODERANGE_CLEAR
macro and I’m not clear on what this does or when/why it would be
needed.

I believe the effect is essentially the same as what rb_str_modify()
does. The macro clears the cache of properties like ‘ascii_only?’ and
‘valid_encoding?’, so that the next time someone queries them Ruby has
to scan the whole string again.
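
To make those cached properties concrete, here is a small Ruby-level
sketch (not code from this thread, and not the C implementation). It
shows that force_encoding leaves the bytes alone and only changes the
label, which is why earlier answers to ‘valid_encoding?’ and
‘ascii_only?’ cannot be reused afterwards. All variable names are
purely illustrative.

```ruby
# "é" encoded as UTF-8 is the two bytes 0xC3 0xA9.
utf8 = "caf\u00E9"

# force_encoding relabels the same bytes; nothing is converted.
relabeled = utf8.dup.force_encoding("US-ASCII")
same_bytes = (relabeled.bytes == utf8.bytes)

# Validity depends on the label, so it must be recomputed after relabeling.
valid_as_utf8  = utf8.valid_encoding?       # well-formed UTF-8
valid_as_ascii = relabeled.valid_encoding?  # 0xC3 is not a 7-bit ASCII byte

# ascii_only? also depends on the label: UTF-16LE is not ASCII-compatible.
plain = "ab"
wide  = plain.dup.force_encoding("UTF-16LE")
plain_ascii_only = plain.ascii_only?
wide_ascii_only  = wide.ascii_only?
```

Here same_bytes, valid_as_utf8 and plain_ascii_only come out true,
while valid_as_ascii and wide_ascii_only come out false, even though
no byte was ever changed.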

= String#encode
I wasn’t able to find any documentation about String transcoding
within C extensions. Based on the implementation of String#encode,
there seems to be a function ‘rb_str_encode’ in ‘ruby/encoding.h’ that
might be appropriate, but there’s no documentation for the method and
I couldn’t completely reverse engineer how the ‘ecflags’ and ‘ecopts’
arguments are used.

Maybe easier just to rb_funcall it?

Good luck,

Brian.

On Thu, Jul 19, 2012 at 8:42 AM, Brian C. wrote:

Nathan B. wrote in post #1069253:

I wasn’t able to find anything in the README.EXT
(http://svn.ruby-lang.org/repos/ruby/tags/v1_9_3_194/README.EXT).

Welcome to ruby 1.9.x, where everything to do with string encodings is
completely undocumented.

I’ve noticed ... :)

I wasn’t able to find any documentation about String transcoding
within C extensions. Based on the implementation of String#encode,
there seems to be a function ‘rb_str_encode’ in ‘ruby/encoding.h’ that
might be appropriate, but there’s no documentation for the method and
I couldn’t completely reverse engineer how the ‘ecflags’ and ‘ecopts’
arguments are used.

Maybe easier just to rb_funcall it?

That’s what I’ve done, but I wanted to check if there was something
more appropriate I should be doing while in the C code.
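
For reference, this is roughly what the Ruby-level method reached
through rb_funcall does, sketched in plain Ruby rather than C. Unlike
force_encoding, encode rewrites the bytes. The keyword options shown
(undef: and replace:) are the documented String#encode options; that
the ‘ecopts’ argument of rb_str_encode carries the same options is
only an assumption here, not something confirmed in this thread.

```ruby
# Build "café" as ISO-8859-1 bytes, where "é" is the single byte 0xE9.
latin1 = [0x63, 0x61, 0x66, 0xE9].pack("C*").force_encoding("ISO-8859-1")

# encode transcodes: the characters stay the same, the bytes change.
utf8 = latin1.encode("UTF-8")
utf8_bytes = utf8.bytes  # "é" is now the two bytes 0xC3 0xA9

# Converting to an encoding that lacks "é" raises unless told otherwise.
strict_error =
  begin
    utf8.encode("US-ASCII")
    nil
  rescue Encoding::UndefinedConversionError => e
    e.class
  end

# With options, the unconvertible character is replaced instead.
lossy = utf8.encode("US-ASCII", undef: :replace, replace: "?")
```

So a C extension that only wants to relabel bytes is after the
force_encoding behaviour, while one that needs the bytes converted is
after the encode behaviour sketched above.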