Hi!
I have the Ruby VM embedded via the public ruby_* functions and execute
scripts via rb_eval_string_protect. The version I am using is ruby
2.1.0dev (2013-09-27 trunk 43059), and I’m working on 64-bit Linux
(Fedora 17). So far it has been working okay for the most part. However,
I recently discovered that UTF-8 strings apparently aren’t recognized
properly in my scripts.
As an example, if I run the following code via the normal ruby
executable:
p "éàß"
I get:
> "éàß"
back, as expected. However, doing the same thing via my embedded C++
app:
#include "ruby.h"

int main(int argc, char** argv)
{
    ruby_setup();
    const char rbScript[] = "p \"éàß\"";
    rb_eval_string_protect(rbScript, 0);
    return 0;
}
I get:
> "\xC3\xA9\xC3\xA0\xC3\x9F"
I feel like I’m just missing one tiny init function somewhere that sets
up proper UTF-8 support, but even after going through the main()
function of the ruby executable and all the functions it calls, I can’t
seem to find this magic init procedure.
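For reference, the init sequence in the interpreter’s own main() that I
compared against looks roughly like this (reconstructed from memory and
heavily simplified, so the details may well be off):

#include <locale.h>
#include "ruby.h"

int main(int argc, char** argv)
{
    /* the interpreter sets the process locale from the environment first */
    setlocale(LC_CTYPE, "");

    ruby_sysinit(&argc, &argv);
    {
        RUBY_INIT_STACK;
        ruby_init();
        return ruby_run_node(ruby_options(argc, argv));
    }
}

As far as I can tell, ruby_setup() does essentially the same work as
ruby_init(), so I can’t spot which of the remaining steps makes the
difference.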
This is especially problematic for me because when I iterate over the
individual characters of a string, I get each single UTF-8 byte instead
of the full multibyte character.
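To make that concrete, here is a sketch of the kind of thing I mean
(same setup as the snippet above; iterScript is just a made-up name, and
the output is written down from memory):

#include "ruby.h"

int main(int argc, char** argv)
{
    ruby_setup();

    /* iterate over the characters of a UTF-8 literal inside the embedded VM */
    const char iterScript[] = "\"éàß\".each_char { |c| p c }";
    rb_eval_string_protect(iterScript, 0);

    /* what I get:      "\xC3", "\xA9", "\xC3", "\xA0", "\xC3", "\x9F"
       what I expected: "é", "à", "ß" */
    return 0;
}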
Can someone help me out? Thanks in advance!
Jonas