Forum: Ruby Common Traps for C extensions

Posted by Bernhard Brodowsky (lykos)
on 2012-08-29 20:08
Hi, I am writing a toy library in C++ and am currently writing a Ruby
extension for it.

I have one C++ layer that catches all the C++ exceptions, converts them
into error codes, casts the void pointers it gets from C to the
appropriate classes, etc. Then I have one C layer which handles the Ruby
data types and raises the correct exceptions.

But now I am thinking of possible traps I might run into. For example,
the Ruby interpreter forces me to differentiate between allocation and
initialization, and in my current implementation the user could redefine
the initialize() method of my class and then call another method, which
results in undefined behaviour or possibly a segfault. I can easily solve
that problem (e.g. by setting some internal flag), but there are probably
a thousand other typical traps like that, where the dynamic nature of Ruby
messes with my C memory management. Do you know of any other important ones?
Posted by Robert Klemme (robert_k78)
on 2012-08-30 09:18
(Received via mailing list)
On Wed, Aug 29, 2012 at 8:08 PM, Bernhard Brodowsky
<lists@ruby-forum.com> wrote:
> Hi, I am writing a toy library in C++ and am currently writing a Ruby
> extension for it.
>
> I have one C++ layer that catches all the C++ exceptions, converts them
> into error codes, casts the void pointers it gets from C to the
> appropriate classes, etc. Then I have one C layer which handles the Ruby
> data types and raises the correct exceptions.

Wouldn't it be simpler to implement just one layer of C++ functions
with extern "C" that do all the adjustments (i.e. catch C++ exceptions
and convert types)?
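A minimal sketch of what such a single layer could look like, with no Ruby headers involved. The Encrypter class, the wrap_status codes and the encrypter_create / encrypter_destroy names are all hypothetical and only illustrate catching C++ exceptions behind extern "C" and handing out an opaque pointer:

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <string>

// Hypothetical C++ class standing in for the library being wrapped.
class Encrypter {
public:
    explicit Encrypter(const std::string& key) : key_(key) {
        if (key.empty()) throw std::invalid_argument("key must not be empty");
    }
    std::size_t key_length() const { return key_.size(); }
private:
    std::string key_;
};

// Hypothetical error codes that the caller can map to Ruby exceptions.
enum wrap_status {
    WRAP_OK = 0,
    WRAP_BAD_ARGUMENT = 1,
    WRAP_NO_MEMORY = 2,
    WRAP_UNKNOWN = 3
};

extern "C" {

// Creates an Encrypter and hands it out as an opaque pointer.
// No C++ exception may escape, so every path ends in a status code.
int encrypter_create(const char* key, void** out) {
    if (key == nullptr || out == nullptr) return WRAP_BAD_ARGUMENT;
    *out = nullptr;
    try {
        *out = new Encrypter(key);
        return WRAP_OK;
    } catch (const std::invalid_argument&) {
        return WRAP_BAD_ARGUMENT;
    } catch (const std::bad_alloc&) {
        return WRAP_NO_MEMORY;
    } catch (...) {
        return WRAP_UNKNOWN;
    }
}

// Casts the opaque pointer back to the real class before deleting it.
void encrypter_destroy(void* handle) {
    delete static_cast<Encrypter*>(handle);
}

}
```

The caller only ever sees an int and a void pointer, so the same functions can be used from a C file or from C++ code that talks to Ruby directly.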

> But now I am thinking of possible traps I might run into. For example,
> the Ruby interpreter forces me to differentiate between allocation and
> initialization, and in my current implementation the user could redefine
> the initialize() method of my class and then call another method, which
> results in undefined behaviour or possibly a segfault.

I am not sure I understand the scenario. Are you talking about a user
redefining #initialize in Ruby land leading to improperly initialized
C / C++ data structures?

> I can easily solve
> that problem (e.g. by setting some internal flag), but there are probably
> a thousand other typical traps like that, where the dynamic nature of Ruby
> messes with my C memory management. Do you know of any other important ones?

I never did serious C extension coding so I can't help you with
general guidelines.  Storing something which verifies integrity of the
C++ data structures is certainly a good idea.  If I think about it,
isn't it sufficient to check whether a pointer to the C++ struct is
valid, i.e. not NULL?  It certainly depends on how you design the
interface between Ruby and C / C++ world: you could completely rely on
C / C++ state or make use of Ruby instance variables from C / C++
which would probably make things more complicated.

Kind regards

robert
Posted by Bernhard Brodowsky (lykos)
on 2012-08-30 09:34
Hi, thanks for your answer. Somehow it didn't compile as C++ code when I
did the conversions there, so I assumed it was not possible and split it
into one part that is compiled as C++ with extern "C" functions and one
part that is compiled as C.

Exactly, that is what I am talking about. Checking for non-null does not
suffice, because Ruby calls my alloc function first, so the pointer is
actually valid, but it points to an uninitialized object.
Posted by Robert Klemme (robert_k78)
on 2012-08-30 11:23
(Received via mailing list)
On Thu, Aug 30, 2012 at 9:34 AM, Bernhard Brodowsky
<lists@ruby-forum.com> wrote:

> Hi, thanks for your answer. Somehow it didn't compile as C++ code, if I
> do the conversions,

Well, as long as we do not see the code and / or the error we can't
really tell why it did not compile.  I created a test this morning
which worked but threw it away.  You do need to compile the catching
and conversion code with extern "C" with a C++ compiler though.

> Exactly, that is what I am talking about, checking for non-null does not
> suffice, because Ruby calls my alloc function first, so the pointer is
> actually valid, but it points to an uninitialized object.

Yeah, but your allocation function could place one or more NULL
pointers in the structure which get filled later, couldn't it?
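A rough sketch of that idea, written without the Ruby headers so it stands alone. Cipher, CipherHandle and the cipher_* functions are hypothetical names. The point is only that allocation leaves a null slot, initialization fills it, and every other entry point checks the slot before touching it:

```cpp
#include <cassert>
#include <string>

// Hypothetical C++ object owned by the Ruby wrapper.
struct Cipher {
    std::string key;
};

// What the allocation function would wrap: just a slot for the real object.
struct CipherHandle {
    Cipher* impl;  // stays nullptr until initialization has run
};

// Allocation step: always leaves the handle in a safe, not yet usable state.
CipherHandle* cipher_alloc() {
    return new CipherHandle{nullptr};
}

// Initialization step: may never run if the user redefines initialize.
void cipher_init(CipherHandle* h, const std::string& key) {
    Cipher* fresh = new Cipher{key};
    delete h->impl;  // tolerate initialize being called twice
    h->impl = fresh;
}

// Every other method checks the slot first and reports failure instead of
// dereferencing a null pointer. In a real extension this is where an
// exception would be raised on the Ruby side.
bool cipher_key_length(const CipherHandle* h, std::size_t* out) {
    if (h == nullptr || h->impl == nullptr) return false;
    *out = h->impl->key.size();
    return true;
}

// Free step: must also cope with a handle that was never initialized.
void cipher_free(CipherHandle* h) {
    if (h == nullptr) return;
    delete h->impl;  // deleting nullptr is a no-op
    delete h;
}
```

With this layout a skipped or overridden initialize leads to a reported error rather than a crash, and the free function works whether or not initialization ever happened.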

Kind regards

robert
Posted by Henry Maddocks (Guest)
on 2012-08-31 00:41
(Received via mailing list)
On 30/08/2012, at 7:34 PM, Bernhard Brodowsky wrote:

>
> But now I am thinking of possible traps I might run into. For example,
> the Ruby interpreter forces me to differentiate between allocation and
> initialization, and in my current implementation the user could redefine
> the initialize() method of my class and then call another method, which
> results in undefined behaviour or possibly a segfault.


Someone might sub-class and not call super.

If it's possible to segfault your extension, then you are doing it
wrong. After calling the alloc function, your class might not be
'valid', but it must be 'safe'. This is the reason Ruby added the alloc
hook.

Remember, when you are writing an extension you are effectively writing
Ruby, and in Ruby implementing initialize is optional and nil (or Qnil)
is a valid value. You should bear this in mind when designing your
extension.

Henry
Posted by Bernhard Brodowsky (lykos)
on 2012-09-03 01:52
Robert Klemme wrote in post #1073887:
> On Thu, Aug 30, 2012 at 9:34 AM, Bernhard Brodowsky
> <lists@ruby-forum.com> wrote:
>
>> Hi, thanks for your answer. Somehow it didn't compile as C++ code, if I
>> do the conversions,
>
> Well, as long as we do not see the code and / or the error we can't
> really tell why it did not compile.  I created a test this morning
> which worked but threw it away.  You do need to compile the catching
> and conversion code with extern "C" with a C++ compiler though.
>

I defined a function with two VALUE args and a VALUE return type, and I
wanted this to be a method with one argument (plus the self argument),
but I always got this error:

error: invalid conversion from 'VALUE (*)(VALUE, VALUE) {aka long 
unsigned int (*)(long unsigned int, long unsigned int)}' to 'VALUE 
(*)(...) {aka long unsigned int (*)(...)}' [-fpermissive]


>> Exactly, that is what I am talking about, checking for non-null does not
>> suffice, because Ruby calls my alloc function first, so the pointer is
>> actually valid, but it points to an uninitialized object.
>
> Yeah, but your allocation function could place one or more NULL
> pointers in the structure which get filled later, couldn't it?
>

Yes, I solved it exactly this way.

Henry Maddocks wrote in post #1073985:
> On 30/08/2012, at 7:34 PM, Bernhard Brodowsky wrote:
>
>>
>> But now I am thinking of possible traps I might run into. For example,
>> the Ruby interpreter forces me to differentiate between allocation and
>> initialization, and in my current implementation the user could redefine
>> the initialize() method of my class and then call another method, which
>> results in undefined behaviour or possibly a segfault.
>
>
> Someone might sub-class and not call super.
>
> If it's possible to seg fault your extension then you are doing it
> wrong. After calling the alloc function, your class might not be
> 'valid', but it must be 'safe'. This is the reason Ruby added the alloc
> hook.

Yes, that is what I am trying to achieve. It turned out to be more
difficult than I first thought, but it is now safe for every scenario
I can think of.

Cheers,
Bernhard
Posted by Robert Klemme (robert_k78)
on 2012-09-03 09:40
(Received via mailing list)
On Mon, Sep 3, 2012 at 1:52 AM, Bernhard Brodowsky 
<lists@ruby-forum.com> wrote:
>> and conversion code with extern "C" with a C++ compiler though.
>
> I defined a function with two VALUE args and a VALUE return type and I
> wanted this to be a method with one argument (plus the self Argument)
> but I always got this error:
>
> error: invalid conversion from 'VALUE (*)(VALUE, VALUE) {aka long
> unsigned int (*)(long unsigned int, long unsigned int)}' to 'VALUE
> (*)(...) {aka long unsigned int (*)(...)}' [-fpermissive]

As I said, as long as we do not see the code...  I suggest you create
an SSCCE (http://sscce.org/) and post it.

Cheers

robert
Posted by Bernhard Brodowsky (lykos)
on 2012-09-04 13:16
Thanks anyway, but I rewrote everything and I am using Rice now, which
works much better. The one exception is that it doesn't automatically
handle the problem of the user redefining initialize, and a workaround is
a little cumbersome since Rice assumes it defines the initialize method
itself. Maybe I should report this to the Rice developers, because I
guess nobody wants a segfault if the user redefines initialize.

The smallest code that reproduces my previous error looks roughly like this:

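#include <ruby.h>

extern "C" {

// One explicit argument (key) in addition to self.
VALUE encrypt(VALUE self, VALUE key) {
  return Qnil;
}

static VALUE Encrypter = Qnil;

void Init_RubyCrypto() {
  Encrypter = rb_define_class("Encrypter", rb_cObject);
  // The C++ compiler reports the "invalid conversion" error on this line.
  // argc is 1 because self is not counted.
  rb_define_method(Encrypter, "encrypt", encrypt, 1);
}

}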

The problem appears to be that rb_define_method takes a function pointer
with a variable number of arguments, but encrypt has 2 arguments; C
somehow accepts this implicitly, while C++ complains. But it doesn't
matter anymore since I use Rice now.
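For anyone hitting the same error, here is a small stand-alone sketch of the mismatch and of the explicit cast that gets past it. The typedefs mirror the types in the error message, but define_method, registered and call_registered are hypothetical stand-ins, not the real Ruby API. In a real extension the same cast is usually written with the RUBY_METHOD_FUNC macro from ruby.h:

```cpp
#include <cassert>

// Stand-ins for the types in the quoted error message; not the real Ruby headers.
typedef unsigned long VALUE;
typedef VALUE (*any_method)(...);              // the variadic pointer type
typedef VALUE (*method_arity1)(VALUE, VALUE);  // self plus one argument

// Hypothetical registry slot, playing the role of the method table.
static any_method registered = nullptr;
static int registered_argc = -1;

// Hypothetical stand-in for rb_define_method: accepts only the variadic type.
void define_method(any_method fn, int argc) {
    registered = fn;
    registered_argc = argc;
}

// A method with self and one argument, shaped like encrypt in the post.
VALUE add_values(VALUE self, VALUE other) {
    return self + other;
}

void register_methods() {
    // define_method(add_values, 1);  // C++ rejects this: no implicit conversion
    define_method(reinterpret_cast<any_method>(add_values), 1);
}

// The pointer must be cast back to its real signature before it is called.
VALUE call_registered(VALUE self, VALUE arg) {
    assert(registered_argc == 1);
    return reinterpret_cast<method_arity1>(registered)(self, arg);
}
```

The stored argc is what tells the caller which signature to cast back to, which is why the count passed at registration has to match the function.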

Thanks for the help