Forum: Ruby-core volatile usages

18813f71506ebad74179bf8c5a136696?d=identicon&s=25 Eric Wong (Guest)
on 2014-02-13 11:05
(Received via mailing list)
Hi all, I went ahead and removed some use of volatile which were once
needed, but no longer.  (r44929. r44930)

I'm not sure about the reason for the volatile VALUE v in re.c:
rb_reg_s_union, so I'm leaving it alone.


Also, in string.c, I'm not sure if the volatile declaration is enough,
using RB_GC_GUARD below seems more correct (and generates smaller code
on x86-32, at least):

--- a/string.c
+++ b/string.c
@@ -1401,10 +1401,12 @@ rb_str_times(VALUE str, VALUE times)
 static VALUE
 rb_str_format_m(VALUE str, VALUE arg)
 {
-    volatile VALUE tmp = rb_check_array_type(arg);
+    VALUE tmp = rb_check_array_type(arg);

     if (!NIL_P(tmp)) {
-  return rb_str_format(RARRAY_LENINT(tmp), RARRAY_CONST_PTR(tmp), str);
+  VALUE rv = rb_str_format(RARRAY_LENINT(tmp), RARRAY_CONST_PTR(tmp),
str);
+  RB_GC_GUARD(tmp);
+  return rv;
     }
     return rb_str_format(1, &arg, str);
 }
F1d6cc2b735bfd82c8773172da2aeab9?d=identicon&s=25 Nobuyoshi Nakada (nobu)
on 2014-02-14 03:40
(Received via mailing list)
(2014/02/13 19:04), Eric Wong wrote:
> Also, in string.c, I'm not sure if the volatile declaration is enough,
> using RB_GC_GUARD below seems more correct (and generates smaller code
> on x86-32, at least):

LGTM.
18813f71506ebad74179bf8c5a136696?d=identicon&s=25 Eric Wong (Guest)
on 2014-02-14 11:56
(Received via mailing list)
Nobuyoshi Nakada <nobu@ruby-lang.org> wrote:
> (2014/02/13 19:04), Eric Wong wrote:
> > Also, in string.c, I'm not sure if the volatile declaration is enough,
> > using RB_GC_GUARD below seems more correct (and generates smaller code
> > on x86-32, at least):
>
> LGTM.

Thanks.  I also have a cleaner alternative to Bug #7805.

I think bare volatile use is too misunderstood/often-buggy-in-compilers
to be used for GC safety, so discourage it with RB_GC_GUARD (the lesser
of two evils).

Hopefully this will make the source clearer to other contributors.  We
should also add a section to README.EXT to document RB_GC_GUARD usage.

Something like this, but hopefully more correct, concise and clearer
than what I can write (comments/corrections appreciated):

=== RB_GC_GUARD to protect premature GC

C Ruby currently uses conservative garbage collection, thus VALUE
variables must remain visible on the stack or registers to ensure any
associated data remains usable.  Optimizing C compilers are not designed
with conservative garbage collection in mind, so they may optimize away
the original VALUE even if the code depends on data associated with that
VALUE.

The following example illustrates the use of RB_GC_GUARD to ensure
the contents of sptr remain valid while the second invocation of
rb_str_new_cstr is running.

  VALUE s, w;
  const char *sptr;

  s = rb_str_new_cstr("hello world!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!");
  sptr = RSTRING_PTR(s);
  w = rb_str_new_cstr(sptr + 6); /* Possible GC invocation */

  RB_GC_GUARD(s); /* ensure s (and thus sptr) do not get GC-ed */

In the above example, RB_GC_GUARD must be placed _after_ use of
sptr.  Placing RB_GC_GUARD before dereferencing sptr would be of no use.
RB_GC_GUARD is only effective on the VALUE data type, not converted C
data types.

RB_GC_GUARD would not be necessary at all in the above example if
non-inlined function calls are made on the `s' VALUE after sptr is
dereferenced.  Thus, in the above example, calling any un-inlined
function on `s' such as:

  rb_str_modify(s);

Will ensure `s' stays on the stack or register to prevent a
GC invocation from prematurely freeing it.


Using the RB_GC_GUARD macro is preferable to using the "volatile"
keyword in C.  RB_GC_GUARD has the following advantages:

1) the intent of the macro use is clear

2) RB_GC_GUARD only affects its call site, "volatile" generates some
   extra code every time the variable is used, hurting optimization.

3) "volatile" implementations may be buggy/inconsistent in some
   compilers and architectures. RB_GC_GUARD is customizable for broken
   systems/compilers without those without negatively affecting other
   systems.

-----------------------------------8<----------------------------------

How much would link-time optimization (LTO) affect GC?  I fear it may
require the addition more RB_GC_GUARDs.  Fortunately nobody uses
LTO by default (it is very slow to link/optimize).
18813f71506ebad74179bf8c5a136696?d=identicon&s=25 Eric Wong (Guest)
on 2014-02-15 02:53
(Received via mailing list)
array.c: volatile in rb_ary_each along with the explanation in r32201 is
scaring me :x  If that volatile is needed, then it would also be
necessary in similar functions such as rb_ary_each_with_index and
rb_reverse_each.  Of course, those functions do not use volatile and
seem fine after all these years.

Anways, with r32201, I can understand the need for volatile in cont.c
because of setjmp.  Also, vm_call_cfunc no longer uses volatile, but
its use in rb_ary_each is troubling...

Anyways, I cannot find problems with
Debian clang version 3.0-6.2 (tags/RELEASE_30/final) (based on LLVM 3.0)
18813f71506ebad74179bf8c5a136696?d=identicon&s=25 Eric Wong (Guest)
on 2014-02-16 05:48
(Received via mailing list)
Eric Wong <normalperson@yhbt.net> wrote:
> array.c: volatile in rb_ary_each along with the explanation in r32201 is
> scaring me :x  If that volatile is needed, then it would also be
> necessary in similar functions such as rb_ary_each_with_index and
> rb_reverse_each.  Of course, those functions do not use volatile and
> seem fine after all these years.

I suspect it is because rb_ary_each is in the public C API,
so a guard becomes necessary.

--- a/array.c
+++ b/array.c
@@ -1784,16 +1784,21 @@ ary_enum_length(VALUE ary, VALUE args, VALUE
eobj)
  */

 VALUE
-rb_ary_each(VALUE array)
+rb_ary_each(VALUE ary)
 {
     long i;
-    volatile VALUE ary = array;

     RETURN_SIZED_ENUMERATOR(ary, 0, 0, ary_enum_length);
     for (i=0; i<RARRAY_LEN(ary); i++) {
   rb_yield(RARRAY_AREF(ary, i));
     }
-    return ary;
+
+    /*
+     * GC guard may be necessary if rb_ary_each is called like this:
+     *   rb_ary_each(rb_ary_new());
+     * (ary argument is allocated inline and the return value ignored)
+     */
+    return RB_GC_GUARD(ary);
 }

 /*
This topic is locked and can not be replied to.