Forum: Ruby-core ruby_init() and C call stack

Suraj N. Kurapati (Guest)
on 2008-04-12 23:07
(Received via mailing list)
Hello,

From what I have experienced so far, it seems that ruby_init() makes the
following assumption about the C call stack.  Please correct me if I am wrong.

    The C call stack must NOT shrink beyond its
    state at the time of ruby_init() invocation.

What I mean by this is:  we must never return from the C function
body where ruby_init() is called, because doing so will shrink the C
call stack.

Is this true?

Thanks for your consideration.
Yukihiro Matsumoto (Guest)
on 2008-04-13 00:43
Hi,

In message "Re: ruby_init() and C call stack"
    on Sun, 13 Apr 2008 06:06:45 +0900, "Suraj N. Kurapati"
<sunaku@gmail.com> writes:

|    The C call stack must NOT shrink beyond its
|    state at the time of ruby_init() invocation.

|Is this true?

No Ruby object should be referenced from a stack region lower than the
position at the time of the ruby_init() invocation.  It is difficult to
ensure this portably, so we can say it is safe to keep your assumption.

              matz.
Suraj N. Kurapati (Guest)
on 2008-04-13 03:48
Yukihiro Matsumoto wrote:
> No Ruby object should be referenced from a stack region lower than the
> position at the time of the ruby_init() invocation.  It is difficult to
> ensure this portably, so we can say it is safe to keep your assumption.

Thank you for the clear explanation!  Now that this is clarified, I
have another question:

For Ruby 1.8, I was able to guarantee the assumption by only using
Ruby's C API inside a POSIX thread:  the thread preserves the C call
stack for the life of the Ruby interpreter.

However, for Ruby 1.9, it is not possible to invoke ruby_init()
inside a POSIX thread (as both you and Nobu have explained in
ruby-core:15759 and ruby-core:15760).

Therefore, how can I guarantee the assumption for Ruby 1.9?


I have one idea, so far, for a possible solution:

1. calloc() a region of memory as a "sandbox" for Ruby.

2. Whenever you want to use Ruby's C API:

2.a. make a backup of the current system stack pointer

2.b. assign the address of the sandbox to the system stack pointer

2.c. execute the desired C function in Ruby's C API

2.d. restore the original system stack pointer

This may not be possible directly, but something along these lines.
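
Steps 1 and 2.a to 2.d can be approximated without assigning to the stack
pointer by hand, using the ucontext functions. The sketch below is only an
assumption about how such a sandbox might look, not something proposed in
this thread: run_on_sandbox(), work_on_sandbox() and SANDBOX_SIZE are
hypothetical names, and work_on_sandbox() merely stands in for a Ruby C API
call. It shows the stack switch itself and nothing more; it does not show
that Ruby's GC would tolerate being called this way.

```c
#include <stdint.h>
#include <stdlib.h>
#include <ucontext.h>

#define SANDBOX_SIZE (256 * 1024)   /* hypothetical size of the private stack */

static ucontext_t caller_ctx;       /* step 2.a: the caller's state is saved here */
static ucontext_t sandbox_ctx;      /* context that runs on the calloc()ed stack */
static char *sandbox_base;          /* start of the calloc()ed region */

int ran_inside_sandbox;             /* 1 if a local variable lived in the sandbox */
int sandbox_result;                 /* value produced by the stand-in function */

/* Hypothetical stand-in for step 2.c, "the desired C function in Ruby's C API". */
void work_on_sandbox(void)
{
    volatile int probe = 0;
    uintptr_t here = (uintptr_t)&probe;
    uintptr_t lo = (uintptr_t)sandbox_base;

    /* Check that this frame really sits inside the private region. */
    ran_inside_sandbox = (here >= lo && here < lo + SANDBOX_SIZE);
    sandbox_result = 42;
}

/* Runs fn on a heap-allocated stack; returns 0 on success, -1 on failure. */
int run_on_sandbox(void (*fn)(void))
{
    sandbox_base = calloc(1, SANDBOX_SIZE);        /* step 1 */
    if (sandbox_base == NULL)
        return -1;

    if (getcontext(&sandbox_ctx) == -1) {
        free(sandbox_base);
        sandbox_base = NULL;
        return -1;
    }
    sandbox_ctx.uc_stack.ss_sp = sandbox_base;     /* step 2.b */
    sandbox_ctx.uc_stack.ss_size = SANDBOX_SIZE;
    sandbox_ctx.uc_link = &caller_ctx;             /* step 2.d: resume caller when fn returns */
    makecontext(&sandbox_ctx, fn, 0);

    /* steps 2.a and 2.c: save the caller, then run fn on the sandbox stack */
    int rc = swapcontext(&caller_ctx, &sandbox_ctx);

    free(sandbox_base);
    sandbox_base = NULL;
    return rc;
}
```

Calling run_on_sandbox(work_on_sandbox) leaves sandbox_result set and
ran_inside_sandbox equal to 1 when the switch worked. As far as I know,
getcontext(), makecontext() and swapcontext() were dropped from later POSIX
editions although glibc still ships them, and some platforms want a
feature-test macro defined before <ucontext.h> is included.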

Does anyone have other ideas or suggestions?

Thanks for your consideration.
Suraj N. Kurapati (Guest)
on 2008-04-17 17:23
Suraj N. Kurapati wrote:
> ruby-core:15759 and ruby-core:15760).
>
> Therefore, how can I guarantee this assumption for Ruby 1.9?

Anyone?  Is this an impossible task?  Is there a work-around?

Please suggest some ideas or comment on my idea of calloc()ing a
private stack area for Ruby (see parent e-mail).

Thanks for your consideration.

P.S. I really wish Ruby's C API was like OpenGL: a state machine.
It doesn't matter from what stack position you query/modify the
machine  state.

This tiny assumption would make it possible for me to embed Ruby 1.9
inside a Verilog simulator, which loads my C extension (shared
object file / DLL) into itself and calls C functions in my C extension.
Nobuyoshi Nakada (nobu)
on 2008-04-17 18:38
Hi,

At Fri, 18 Apr 2008 00:22:52 +0900,
Suraj N. Kurapati wrote in [ruby-core:16432]:
> > However, for Ruby 1.9, it is not possible to invoke ruby_init()
> > inside a POSIX thread (as both you and Nobu have explained in
> > ruby-core:15759 and ruby-core:15760).
> >
> > Therefore, how can I guarantee this assumption for Ruby 1.9?

Why can't you call ruby_sysinit() and ruby_init() in the main
thread?

> Please suggest some ideas or comment on my idea of calloc()ing a
> private stack area for Ruby (see parent e-mail).

That seems like what a native thread system does.

> P.S. I really wish Ruby's C API was like OpenGL: a state machine.
> It doesn't matter from what stack position you query/modify the
> machine  state.

OpenGL has its own GC?
Paul Brannan (cout)
on 2008-04-17 19:21
On Fri, Apr 18, 2008 at 12:22:52AM +0900, Suraj N. Kurapati wrote:
> P.S. I really wish Ruby's C API was like OpenGL: a state machine.
> It doesn't matter from what stack position you query/modify the
> machine  state.

This would perhaps allow you to run multiple interpreters in the same
process, but it would not solve the stack initialization problem.  Ruby
needs to know the size and location of the stack so the GC can mark
objects on the stack.

Paul
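
To make the point about marking concrete, here is a toy model of a
conservative collector that only scans the part of the stack between the
position recorded at init time and the current position. Everything in it is
an assumption made for illustration: the "stack" is a plain array indexed by
call depth, and toy_init(), toy_mark() and toy_demo() are hypothetical names,
not Ruby's implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define N_OBJECTS   4
#define STACK_WORDS 16

/* Toy heap: each object carries only a mark bit. */
struct object { int marked; };
struct object heap[N_OBJECTS];

/* Simulated machine stack: index 0 is the outermost frame,
 * larger indices are deeper calls. */
uintptr_t stack_words[STACK_WORDS];

/* Depth recorded when the toy interpreter was initialized. */
static size_t recorded_base;

void toy_init(size_t depth_at_init)
{
    recorded_base = depth_at_init;
}

/* Returns the heap object a stack word points at, or NULL. */
static struct object *as_object(uintptr_t word)
{
    for (size_t i = 0; i < N_OBJECTS; i++)
        if (word == (uintptr_t)&heap[i])
            return &heap[i];
    return NULL;
}

/* Mark phase: clears all marks, then scans only the words between
 * the recorded base and the current depth. */
void toy_mark(size_t current_depth)
{
    for (size_t i = 0; i < N_OBJECTS; i++)
        heap[i].marked = 0;

    for (size_t d = recorded_base; d < current_depth && d < STACK_WORDS; d++) {
        struct object *obj = as_object(stack_words[d]);
        if (obj != NULL)
            obj->marked = 1;
    }
}

void toy_demo(void)
{
    toy_init(8);                              /* init happened at depth 8 */
    stack_words[10] = (uintptr_t)&heap[0];    /* deeper than init: scanned */
    stack_words[3]  = (uintptr_t)&heap[1];    /* shallower than init: never scanned */
    toy_mark(12);
}
```

After toy_demo(), heap[0] is marked and heap[1] is not, although both are
still referenced; a real collector would free the second object. Calling
toy_mark() from a depth shallower than the recorded base scans nothing at
all.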
Suraj N. Kurapati (Guest)
on 2008-04-18 03:43
Nobuyoshi Nakada wrote:
>>> inside a POSIX thread (as both you and Nobu have explained in
>>> ruby-core:15759 and ruby-core:15760).
>>>
>>> Therefore, how can I guarantee this assumption for Ruby 1.9?
>
> Why can't you call ruby_sysinit() and ruby_init() in the main
> thread?

Good question.  Allow me to explain my use case:

I invoke a Verilog simulator by specifying the path to my shared
object file.  The simulator loads my shared object file and
immediately invokes a predefined function.  (Note that the
simulator is the main thread.)

Inside this function, I call ruby_init() and then register (with the
simulator) a callback function that must be executed at the start of
the simulation.

After this function returns, the simulator does some other
initialization and then prepares to begin the simulation.

Since I registered a start-of-simulation callback, the simulator
first invokes my callback function (from a *different* C call stack
position in a lower region of memory than the invocation of the
predefined function) before really beginning the simulation.

As a result, any usage of Ruby's C API inside my callback function
causes a segmentation fault.  This is because Ruby was initialized
in one stack region, but now the Ruby C API is being used in a lower
stack region.

For this reason, I am very interested in shielding Ruby from the
main thread's inconsistent C call stack by running Ruby inside a
pthread.  The pthread preserves (for the life of the Ruby
interpreter) the C call stack where ruby_init() was called.

For example, here is some pseudocode describing this technique:

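void* the_ruby_protector(void* dummy)
{
   ruby_init();

   // tell the main thread that Ruby is initialized
   protector_is_done();  // wake up wait_for_protector() callers

   // wait for start-of-simulation callback
   wait_for_simulator();

   // use the Ruby C API
   rb_funcall( ... );
   rb_const_get( ... );

   // hand control back to the_callback_function()
   protector_is_done();
   return NULL;
}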

void the_predefined_function()
{
   pthread_create( ... the_ruby_protector ...);

   // wait until protector is finished initializing Ruby
   wait_for_protector();
}

void the_callback_function()
{
   // resume the protector (allow it to continue executing)
   simulator_is_done();  // wake up wait_for_simulator() callers
   wait_for_protector();
}

The wait_for() and is_done() functions are just wrappers around
pthread_mutex synchronization primitives.  They allow one thread to
be running at a time.
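
One way the wait_for() and is_done() wrappers could be written is sketched
below. This is an assumption, not the code from the original extension: it
uses a condition variable next to the mutex, and whose_turn, pass_turn_to(),
wait_for_turn(), demo_protector() and run_handoff_demo() are hypothetical
names. In the demo the protector hands the turn back after its second phase,
so that the second wait_for_protector() call returns.

```c
#include <pthread.h>

enum turn { TURN_PROTECTOR, TURN_SIMULATOR };

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t turn_changed = PTHREAD_COND_INITIALIZER;
static enum turn whose_turn = TURN_PROTECTOR;   /* protector runs first */

/* Give the turn to the other side and wake it up. */
static void pass_turn_to(enum turn next)
{
    pthread_mutex_lock(&lock);
    whose_turn = next;
    pthread_cond_broadcast(&turn_changed);
    pthread_mutex_unlock(&lock);
}

/* Block until it is the caller's turn. */
static void wait_for_turn(enum turn mine)
{
    pthread_mutex_lock(&lock);
    while (whose_turn != mine)
        pthread_cond_wait(&turn_changed, &lock);
    pthread_mutex_unlock(&lock);
}

/* The four names used in the pseudocode. */
void protector_is_done(void)  { pass_turn_to(TURN_SIMULATOR); }
void wait_for_simulator(void) { wait_for_turn(TURN_PROTECTOR); }
void simulator_is_done(void)  { pass_turn_to(TURN_PROTECTOR); }
void wait_for_protector(void) { wait_for_turn(TURN_SIMULATOR); }

/* Demo: record which side ran in which order. */
int trace[4];
int trace_len;
static void record(int step) { trace[trace_len++] = step; }

static void *demo_protector(void *unused)
{
    (void)unused;
    record(1);               /* ruby_init() would go here */
    protector_is_done();
    wait_for_simulator();
    record(3);               /* Ruby C API calls would go here */
    protector_is_done();
    return NULL;
}

int run_handoff_demo(void)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, demo_protector, NULL) != 0)
        return -1;
    wait_for_protector();    /* as in the_predefined_function() */
    record(2);
    simulator_is_done();     /* as in the_callback_function() */
    wait_for_protector();
    record(4);
    return pthread_join(tid, NULL);
}
```

Because only the side holding the turn ever runs, the trace array needs no
extra locking: every write happens before the turn is passed under the
mutex.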

>> I really wish Ruby's C API was like OpenGL: a state machine.
>> It doesn't matter from what stack position you query/modify the
>> machine  state.
>
> OpenGL has its own GC?

Excellent point.  However, it could theoretically have its own GC
and hide that fact from a user by malloc()ing its own stack region
instead of relying on the C call stack position during init.  :-)
Suraj N. Kurapati (Guest)
on 2008-04-18 03:57
Paul Brannan wrote:
> On Fri, Apr 18, 2008 at 12:22:52AM +0900, Suraj N. Kurapati wrote:
>> I really wish Ruby's C API was like OpenGL: a state machine.
>> It doesn't matter from what stack position you query/modify the
>> machine  state.
>
> This would perhaps allow you to run multiple interpreters in the same
> process,

I don't think this would allow multiple interpreters because there
is still only one global instance of the state machine (like in OpenGL).

Instead, multiple interpreters would be possible if ruby_init()
returned a token (which identifies a particular instance of the
interpreter) and you used the token whenever calling Ruby's C API.

Alternatively, you could attach/detach a particular Ruby instance
to/from the global state machine.  This is similar to pushing and
popping ModelView matrices in OpenGL and operating on one matrix at
a time.
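
The two styles can be put side by side in a few lines of C. This is purely
an illustration of the idea: struct interp and the functions interp_new(),
interp_free(), interp_bump(), interp_attach() and bump_current() are
hypothetical and have nothing to do with Ruby's actual C API; a counter
stands in for interpreter state.

```c
#include <stddef.h>
#include <stdlib.h>

/* Per-instance state; a counter stands in for a whole interpreter. */
struct interp { long counter; };

struct interp *interp_new(void)
{
    return calloc(1, sizeof(struct interp));
}

void interp_free(struct interp *vm)
{
    free(vm);
}

/* Token style: every call names the instance it operates on. */
long interp_bump(struct interp *vm)
{
    return ++vm->counter;
}

/* Attach/detach style: calls operate on whichever instance is
 * currently attached to the single global slot. */
static struct interp *current;

/* Attaches vm and returns the previously attached instance (or NULL). */
struct interp *interp_attach(struct interp *vm)
{
    struct interp *previous = current;
    current = vm;
    return previous;
}

/* Caller must have attached an instance first. */
long bump_current(void)
{
    return ++current->counter;
}
```

With two instances a and b, interp_bump(a) and interp_bump(b) advance
separate counters, while bump_current() advances whichever one was passed to
interp_attach() last. Neither style says anything about where the GC looks
for references on the C stack.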

> but it would not solve the stack initialization problem.  Ruby
> needs to know the size and location of the stack so the GC can mark
> objects on the stack.

If ruby_init() dynamically allocated its own stack region, then it
would know both the size and location of the stack.

As a result, you could use Ruby's C API from any C call stack
without causing trouble for Ruby (since Ruby has its own private
stack which is hidden and isolated from the user).
Nobuyoshi Nakada (nobu)
on 2008-04-18 10:38
Hi,

At Fri, 18 Apr 2008 10:41:34 +0900,
Suraj N. Kurapati wrote in [ruby-core:16436]:
> Since I registered a start-of-simulation callback, the simulator
> first invokes my callback function (from a *different* C call stack
> position in a lower region of memory than the invocation of the
> predefined function) before really beginning the simulation.

I think I get the point: the main thread's stack position can
shrink below the position recorded at initialization, which would
cause an underflow.

Can't you try with the latest revision?

> void* the_ruby_protector(void* dummy)
> {
>    wait_for_simulator();
>
>    // use the Ruby C API
>    rb_funcall( ... );
>    rb_const_get( ... );
> }
>

  VALUE the_ruby_thread;

> void the_predefined_function()
> {
     ruby_sysinit();
     RUBY_INIT_STACK;
     ruby_init();
     rb_gc_register_address(&the_ruby_thread);
     the_ruby_thread = ruby_thread_create( ... the_ruby_protector ...);
Paul Brannan (cout)
on 2008-04-18 14:32
On Fri, Apr 18, 2008 at 10:57:14AM +0900, Suraj N. Kurapati wrote:
> As a result, you could use Ruby's C API from any C call stack
> without causing trouble for Ruby (since Ruby has its own private
> stack which is hidden and isolated from the user).

Ruby needs to know the location of the C stack so it can be marked,
otherwise extensions would break.

Paul
Suraj N. Kurapati (Guest)
on 2008-04-19 06:45
Hi,

Nobuyoshi Nakada wrote:
> At Fri, 18 Apr 2008 10:41:34 +0900,
> Suraj N. Kurapati wrote in [ruby-core:16436]:
>> Since I registered a start-of-simulation callback, the simulator
>> first invokes my callback function (from a *different* C call stack
>> position in a lower region of memory than the invocation of the
>> predefined function) before really beginning the simulation.
>
> I think I get the point: the main thread's stack position can
> shrink below the position recorded at initialization, which would
> cause an underflow.

Thanks for your suggestion! [snipped]

I am happy to see that rb_gc_register_address() is also available in
the Ruby 1.8.6 stable release.

> Can't you try with the latest revision?

I tried your suggestion with ruby 1.9 trunk r16077 and the ruby
1.8.6 stable release, and it seems that the problem lies inside the
simulator itself:

* With ruby 1.9 trunk, the simulator crashes when calling
Init_prelude(), regardless of whether rb_gc_register_address() is used.

* With ruby 1.8.6, the simulator crashes when calling rb_funcall()
inside the_callback_function(), regardless of whether
rb_gc_register_address() is used.

I can now sleep in peace knowing that the particular simulator I was
using (NC-Verilog) is the troublemaker. :-)  Thankfully, there are
other Verilog simulators available (Pragmatic CVER, Synopsys VCS,
Mentor Modelsim) which have no problem running my C extension,
regardless of whether rb_gc_register_address() is used.

Thanks for your help.