Within C extension code, which C functions provide the equivalent of
String#force_encoding and String#encode? I wasn’t able to find
anything in the README.EXT
(http://svn.ruby-lang.org/repos/ruby/tags/v1_9_3_194/README.EXT).
= String#force_encoding
Based on these articles [1] [2], it seems as though the suggested
approach for forcing an encoding change in a C extension is to use the
‘rb_enc_associate_index’ function defined in ‘ruby/encoding.h’. Is
that accurate?
The implementation of String#force_encoding does a bit more than that:
static VALUE
rb_str_force_encoding(VALUE str, VALUE enc)
{
    str_modifiable(str);
    rb_enc_associate(str, rb_to_encoding(enc));
    ENC_CODERANGE_CLEAR(str);
    return str;
}
This implementation adds an invocation of the ENC_CODERANGE_CLEAR
macro and I’m not clear on what this does or when/why it would be
needed.
= String#encode
I wasn’t able to find any documentation about String transcoding
within C extensions. Based on the implementation of String#encode,
there seems to be a function ‘rb_str_encode’ in ‘ruby/encoding.h’ that
might be appropriate, but there’s no documentation for the method and
I couldn’t completely reverse engineer how the ‘ecflags’ and ‘ecopts’
arguments are used.
The implementation of String#force_encoding does a bit more than that:
static VALUE
rb_str_force_encoding(VALUE str, VALUE enc)
{
    str_modifiable(str);
    rb_enc_associate(str, rb_to_encoding(enc));
    ENC_CODERANGE_CLEAR(str);
    return str;
}
This implementation adds an invocation of the ENC_CODERANGE_CLEAR
macro and I’m not clear on what this does or when/why it would be
needed.
I believe this is essentially the same as rb_str_modify(). It clears the
string’s cached code range, which is what backs properties like
‘ascii_only?’ and ‘valid_encoding?’, so that the next time someone
queries them the whole string has to be scanned again.
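To make the caching point concrete, here is a small standalone sketch in plain C. It is a toy model, not Ruby's implementation: every toy_* name is made up for illustration, and it only knows two encodings (binary and US-ASCII). It shows why relabelling the bytes has to drop the cached scan result, which is roughly the job ENC_CODERANGE_CLEAR does in rb_str_force_encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. Every name here is hypothetical; none of it is Ruby's API. */
typedef enum { TOY_BINARY, TOY_US_ASCII } toy_encoding;
typedef enum { TOY_CR_UNKNOWN, TOY_CR_7BIT, TOY_CR_VALID, TOY_CR_BROKEN } toy_coderange;

typedef struct {
    const unsigned char *bytes;
    size_t len;
    toy_encoding enc;
    toy_coderange cr;   /* cached result of the last full scan */
    unsigned scans;     /* number of full scans performed so far */
} toy_string;

toy_string toy_new(const unsigned char *bytes, size_t len, toy_encoding enc)
{
    toy_string s = { bytes, len, enc, TOY_CR_UNKNOWN, 0 };
    return s;
}

/* Return the cached code range, scanning the bytes only if it is unknown. */
toy_coderange toy_scan(toy_string *s)
{
    bool high = false;
    size_t i;

    if (s->cr != TOY_CR_UNKNOWN)
        return s->cr;                       /* cache hit, no scan */
    s->scans++;
    for (i = 0; i < s->len; i++) {
        if (s->bytes[i] >= 0x80) { high = true; break; }
    }
    if (!high)
        s->cr = TOY_CR_7BIT;                /* pure ASCII */
    else
        s->cr = (s->enc == TOY_BINARY) ? TOY_CR_VALID : TOY_CR_BROKEN;
    return s->cr;
}

bool toy_ascii_only(toy_string *s)     { return toy_scan(s) == TOY_CR_7BIT; }
bool toy_valid_encoding(toy_string *s) { return toy_scan(s) != TOY_CR_BROKEN; }

/* Relabel the bytes and drop the cache (the ENC_CODERANGE_CLEAR step). */
void toy_force_encoding(toy_string *s, toy_encoding enc)
{
    s->enc = enc;
    s->cr = TOY_CR_UNKNOWN;
}

/* Walk through the scenario; returns the number of full scans it needed. */
unsigned toy_demo(void)
{
    toy_string s = toy_new((const unsigned char *)"caf\xc3\xa9", 5, TOY_BINARY);
    bool valid_as_binary, ascii_only, valid_as_ascii;

    valid_as_binary = toy_valid_encoding(&s);  /* first query: full scan */
    ascii_only = toy_ascii_only(&s);           /* answered from the cache */
    toy_force_encoding(&s, TOY_US_ASCII);      /* cache dropped here */
    valid_as_ascii = toy_valid_encoding(&s);   /* second full scan */

    assert(valid_as_binary && !ascii_only && !valid_as_ascii);
    return s.scans;
}
```

If toy_force_encoding left the cache alone, the last query would return the stale "valid" answer computed under the old encoding, which is the situation the clear step avoids.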
= String#encode
I wasn’t able to find any documentation about String transcoding
within C extensions. Based on the implementation of String#encode,
there seems to be a function ‘rb_str_encode’ in ‘ruby/encoding.h’ that
might be appropriate, but there’s no documentation for the method and
I couldn’t completely reverse engineer how the ‘ecflags’ and ‘ecopts’
arguments are used.
Welcome to ruby 1.9.x, where everything to do with string encodings is
completely undocumented.
I’ve noticed …
I wasn’t able to find any documentation about String transcoding
within C extensions. Based on the implementation of String#encode,
there seems to be a function ‘rb_str_encode’ in ‘ruby/encoding.h’ that
might be appropriate, but there’s no documentation for the method and
I couldn’t completely reverse engineer how the ‘ecflags’ and ‘ecopts’
arguments are used.
Maybe easier just to rb_funcall it?
That’s what I’ve done, but I wanted to check if there was something
more appropriate I should be doing while in the C code.