What is the fastest way to iterate over a hash in C?

Hello,

I’m working on the Fast JSON project and have come up against a puzzling
performance quirk.

Most of the Hash#to_json functionality is implemented in C and performs
much better for it. However, there is one section that runs about 6 times
faster when implemented in Ruby than in C.

I wrote a benchmark that calls to_json on 50000 hashes.

Here is the method in Ruby. The benchmark takes around 1.7 seconds:

def process_internal_json(json, state, depth, delim)
  first = true
  each { |key, value|
    if first
      first = false
    else
      json << delim
    end
    generate_key_value_json(json, state, depth, key, value)
  }
  json
end

Here is the method in C. The benchmark takes around 9.5 seconds:

static VALUE process_internal_json(VALUE self, VALUE json, VALUE state,
                                   VALUE depth, VALUE delim) {
    int first = 1;
    /* Snapshot the hash as an array of [key, value] pairs. */
    VALUE key_value_pairs = rb_funcall(self, rb_intern("to_a"), 0);

    VALUE key_value = Qnil;
    /* Popping consumes the snapshot from the end, so pairs come out in reverse. */
    while ((key_value = rb_ary_pop(key_value_pairs)) != Qnil) {
        if (first == 1) {
            first = 0;
        }
        else {
            rb_str_concat(json, delim);
        }
        VALUE value = rb_ary_pop(key_value);
        VALUE key = rb_ary_pop(key_value);
        generate_key_value_json(self, json, state, depth, key, value);
    }
    return json;
}
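
One side effect of the pop-based loop is worth spelling out. The sketch below is a plain-Ruby mirror of that loop, not the extension code itself; visit_by_pop is a hypothetical name used only for illustration. It assumes Ruby 1.9 or later, where hashes keep insertion order, and shows that the pairs are visited back to front while the hash itself is left untouched:

```ruby
# Hypothetical helper: visit pairs the way the C loop does, by popping
# them off the end of the to_a snapshot.
def visit_by_pop(hash)
  visited = []
  pairs = hash.to_a            # array of [key, value] arrays
  while (pair = pairs.pop)     # pop returns nil once the snapshot is empty
    value = pair.pop           # last element of the pair
    key = pair.pop             # first element of the pair
    visited << [key, value]
  end
  visited
end

h = { "a" => 1, "b" => 2, "c" => 3 }
p visit_by_pop(h)  # pairs in reverse insertion order
p h.to_a           # the hash itself still holds all three pairs
```

If output order matters, the pop-based loop would emit keys in the opposite order from Hash#each.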

It seems like there is some optimization in the Hash#each method. I’m
trying to figure out how to get that same performance benefit using C.
Perhaps it is not worth it, though.

Does anybody know what’s going on?

Thank you,
Brian T.

I found a better solution in C. With this method the benchmark runs in
about 1.4 seconds.

static VALUE process_internal_json(VALUE self, VALUE json, VALUE state,
                                   VALUE depth, VALUE delim) {
    /* Snapshot the hash as an array of [key, value] pairs. */
    VALUE key_value_pairs = rb_funcall(self, rb_intern("to_a"), 0);

    int i;
    /* RARRAY(...)->len is the Ruby 1.8 API; Ruby 1.9 and later use RARRAY_LEN. */
    for (i = 0; i < RARRAY(key_value_pairs)->len; i++) {
        if (i > 0) {
            rb_str_concat(json, delim);
        }
        /* Read by index instead of popping, so the order is preserved. */
        VALUE key_value = rb_ary_entry(key_value_pairs, i);
        VALUE key = rb_ary_entry(key_value, 0);
        VALUE value = rb_ary_entry(key_value, 1);
        generate_key_value_json(self, json, state, depth, key, value);
    }
    return json;
}
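
For anyone who wants to compare the two remaining strategies without building the extension, here is a rough plain-Ruby harness. It is only a sketch under assumptions: append_pair is a hypothetical stand-in for generate_key_value_json, the sample hashes are made up, and timings from pure Ruby will not match those of the C code above.

```ruby
require "benchmark"

# Hypothetical stand-in for generate_key_value_json; not the project's helper.
def append_pair(json, key, value)
  json << key.to_s.inspect << ":" << value.inspect
end

# Index-based walk over the to_a snapshot, as in the follow-up C version.
def join_by_index(hash, delim)
  json = "".dup
  pairs = hash.to_a
  i = 0
  while i < pairs.length
    json << delim if i > 0
    key, value = pairs[i]
    append_pair(json, key, value)
    i += 1
  end
  json
end

# Direct walk with Hash#each, as in the original Ruby version.
def join_by_each(hash, delim)
  json = "".dup
  first = true
  hash.each do |key, value|
    if first
      first = false
    else
      json << delim
    end
    append_pair(json, key, value)
  end
  json
end

# 50000 small made-up hashes, matching the size of the benchmark described above.
hashes = Array.new(50_000) { |n| { "id" => n, "name" => "item#{n}", "ok" => true } }

each_time  = Benchmark.realtime { hashes.each { |h| join_by_each(h, ",") } }
index_time = Benchmark.realtime { hashes.each { |h| join_by_index(h, ",") } }

printf("each:  %.3fs\nindex: %.3fs\n", each_time, index_time)
```

Both helpers should produce the same string for the same hash, so any difference in the timings comes from how the pairs are reached rather than from the output itself.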