Within C extension code, what are the appropriate C functions to use that provide the equivalent functionality of String#force_encoding and String#encode? I wasn't able to find anything in the README.EXT (http://svn.ruby-lang.org/repos/ruby/tags/v1_9_3_19...).

= String#force_encoding

Based on these articles [1] [2], it seems as though the suggested approach for forcing an encoding change in a C extension is to use the 'rb_enc_associate_index' function defined in 'ruby/encoding.h'. Is that accurate?

The implementation of String#force_encoding does a bit more than that:

    static VALUE
    rb_str_force_encoding(VALUE str, VALUE enc)
    {
        str_modifiable(str);
        rb_enc_associate(str, rb_to_encoding(enc));
        ENC_CODERANGE_CLEAR(str);
        return str;
    }

This implementation adds an invocation of the ENC_CODERANGE_CLEAR macro and I'm not clear on what this does or when/why it would be needed.

= String#encode

I wasn't able to find any documentation about String transcoding within C extensions. Based on the implementation of String#encode, there seems to be a function 'rb_str_encode' in 'ruby/encoding.h' that might be appropriate, but there's no documentation for the method and I couldn't completely reverse engineer how the 'ecflags' and 'ecopts' arguments are used.

-Nathan

[1] http://yugui.jp/articles/838
[2] http://tenderlovemaking.com/2009/06/26/string-enco...
on 2012-07-19 03:09
on 2012-07-19 15:42
Nathan Beyer wrote in post #1069253:
> I wasn't able to find anything in the README.EXT
> (http://svn.ruby-lang.org/repos/ruby/tags/v1_9_3_19...).

Welcome to ruby 1.9.x, where everything to do with string encodings is completely undocumented.

> The implementation of String#force_encoding does a bit more than that:
>
> static VALUE
> rb_str_force_encoding(VALUE str, VALUE enc)
> {
>     str_modifiable(str);
>     rb_enc_associate(str, rb_to_encoding(enc));
>     ENC_CODERANGE_CLEAR(str);
>     return str;
> }
>
> This implementation adds an invocation of the ENC_CODERANGE_CLEAR
> macro and I'm not clear on what this does or when/why it would be
> needed.

I believe this is essentially the same as rb_str_modify(). It clears the cache of properties like 'ascii_only?' and 'valid_encoding?', so that next time someone queries them it has to scan the whole string.

> = String#encode
> I wasn't able to find any documentation about String transcoding
> within C extensions. Based on the implementation of String#encode,
> there seems to be a function 'rb_str_encode' in 'ruby/encoding.h' that
> might be appropriate, but there's no documentation for the method and
> I couldn't completely reverse engineer how the 'ecflags' and 'ecopts'
> arguments are used.

Maybe easier just to rb_funcall it?

Good luck,

Brian.
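To make the cache point above concrete, here is a minimal Ruby-level sketch (Ruby rather than C, and not taken from the thread) of the behaviour a C extension would need to reproduce. It shows that force_encoding only relabels the existing bytes, so answers to 'valid_encoding?' and 'ascii_only?' have to be recomputed for the new label, whereas encode produces new bytes. The helper names relabel and transcode are hypothetical, chosen only for this illustration, and String#b assumes Ruby 2.0 or later.

```ruby
# Hypothetical helpers, named only for this illustration.

# Relabel: same bytes, new encoding tag (what String#force_encoding does).
def relabel(str, encoding_name)
  str.dup.force_encoding(encoding_name)
end

# Transcode: convert the bytes into the target encoding (what String#encode does).
def transcode(str, encoding_name)
  str.encode(encoding_name)
end

# The bytes 63 61 66 E9 spell "cafe" with an acute accent in ISO-8859-1.
# String#b returns a binary (ASCII-8BIT) copy of the literal.
latin1 = relabel("caf\xE9".b, "ISO-8859-1")
p latin1.valid_encoding?            # => true
p latin1.ascii_only?                # => false

# Relabelling as UTF-8 keeps the bytes, but a lone 0xE9 is not valid UTF-8,
# so the validity answer changes once the string is rescanned.
mislabeled = relabel(latin1, "UTF-8")
p mislabeled.bytes == latin1.bytes  # => true
p mislabeled.valid_encoding?        # => false

# Transcoding rewrites the bytes: 0xE9 becomes the two-byte sequence C3 A9.
converted = transcode(latin1, "UTF-8")
p converted.bytes                   # => [99, 97, 102, 195, 169]
p converted.valid_encoding?         # => true
```

The rb_funcall route suggested above would end up invoking the same String#encode that transcode calls here, which is one reason it is a reasonable fallback when the C-level transcoding options are unclear.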
on 2012-07-25 01:15
On Thu, Jul 19, 2012 at 8:42 AM, Brian Candler <lists@ruby-forum.com> wrote:
> Nathan Beyer wrote in post #1069253:
>> I wasn't able to find anything in the README.EXT
>> (http://svn.ruby-lang.org/repos/ruby/tags/v1_9_3_19...).
>
> Welcome to ruby 1.9.x, where everything to do with string encodings is
> completely undocumented.

I've noticed ... :)

>> }
>> I wasn't able to find any documentation about String transcoding
>> within C extensions. Based on the implementation of String#encode,
>> there seems to be a function 'rb_str_encode' in 'ruby/encoding.h' that
>> might be appropriate, but there's no documentation for the method and
>> I couldn't completely reverse engineer how the 'ecflags' and 'ecopts'
>> arguments are used.
>
> Maybe easier just to rb_funcall it?

That's what I've done, but I wanted to check if there was something more appropriate I should be doing while in the C code.