C => Ruby plus TCP serialization using Marshal.dump/load

Hey,

We are writing an application that gets data from a C library, passes
it through our Ruby framework and over the network, and then sends
those data back down into the C library. We don’t actually want to
manipulate the data in Ruby, but we do want to be able to store them in
an opaque blob and have them marshal correctly and arrive intact. It is
also worth noting that we are trying to produce a generic framework, so
we want to make no assumptions about the sort of data we might be
getting from C.

What is the best way to do this? We tried the naive approach, using
Data_Wrap_Struct to pass the data into Ruby and then Data_Get_Struct to
move the data back down to C, but then when we tried to serialize it
using Marshal.dump, it failed (no marshal_dump is defined for class
Object).

Is there a relatively painless way to wrap data and pass them to Ruby
in such a way that these data can then be serialized over the network?

Thanks,
Nathan

Nathan B. wrote:

> …
>
> Is there a relatively painless way to wrap data and pass them to Ruby
> in such a way that these data can then be serialized over the network?

One option that sacrifices some speed in exchange for robustness is to
wrap arbitrary binary data in a CDATA block within an XML data stream.

This requires that the data be parsed, and it is slower than some other
approaches, but it plays nicely with much of the existing infrastructure
that understands XML database content.

I assume you are speaking of a data block with no particular structure,
that is, data that might be a raw binary stream. If the data could be
expressed in plain text, I would suggest something different.
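
One caveat for the XML route: XML 1.0 does not allow every byte value
(NUL, for instance), and a "]]>" sequence ends a CDATA section, so raw
binary usually has to be text-encoded first. A minimal sketch using
Base64 follows. The <blob> element and its encoding attribute are made
up for this example and are not part of any schema mentioned in the
thread.

```ruby
# Sketch: carry an opaque binary blob inside an XML text node by
# Base64-encoding it first. The <blob> element name is hypothetical.

# Encode arbitrary bytes as a single-line Base64 string (pack 'm0').
def blob_to_xml(bytes)
  encoded = [bytes].pack('m0')
  "<blob encoding=\"base64\">#{encoded}</blob>"
end

# Recover the original bytes from an element built by blob_to_xml.
def xml_to_blob(xml)
  encoded = xml[%r{<blob[^>]*>([^<]*)</blob>}, 1]
  raise ArgumentError, 'no <blob> element found' if encoded.nil?
  encoded.unpack1('m0')
end

raw = "\x00\x01]]>\xFF".b      # includes NUL and a "]]>" sequence
xml = blob_to_xml(raw)
restored = xml_to_blob(xml)    # same bytes as raw
```

Base64 grows the payload by roughly a third, which is part of the speed
cost mentioned above.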

Nathan B. wrote:

> What is the best way to do this? We tried the naive approach, using
> Data_Wrap_Struct to pass the data into Ruby and then Data_Get_Struct to
> move the data back down to C, but then when we tried to serialize it
> using Marshal.dump, it failed (no marshal_dump is defined for class
> Object).
>
> Is there a relatively painless way to wrap data and pass them to Ruby
> in such a way that these data can then be serialized over the network?

If it’s an opaque blob, then it sounds like String is the best class to
hold your data. The C->Ruby part is just rb_str_new() (or rb_str_new2(),
if the data is null-terminated). Any reason this won’t work in your
case?
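
A minimal sketch of the Ruby side of this suggestion, assuming the blob
has already arrived as a binary String. In a real extension the String
would come from rb_str_new(ptr, len); here the bytes are faked with
Array#pack purely for illustration.

```ruby
# Sketch: treat the C data as an opaque, binary-encoded Ruby String.
# The bytes below stand in for whatever the C library would hand over.
blob = [1, 2, 3].pack('l<3') + "\x00tail".b   # three 32-bit ints, a NUL, text

# Marshal handles String natively, so no marshal_dump method is needed.
wire = Marshal.dump(blob)
copy = Marshal.load(wire)

puts copy == blob                          # compares byte for byte
puts copy.encoding == Encoding::BINARY     # encoding survives the trip
```

Embedded NUL bytes are not a problem for a Ruby String, which is why the
length-taking rb_str_new() is the safer constructor for arbitrary data.
Note that Marshal.load should only be fed data from peers you trust.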

On Wed, 13 Dec 2006, Nathan B. wrote:

> [...]
> Nathan

i think you really want to be using mmap. that way there is no
‘passing’. the interface to mmap is as a string, so joel’s comments
there apply. mmap is available on *nix or windoze.

cheers.

-a

On 12/12/06, Nathan B. [email protected] wrote:

> Paul-
>
> You are correct–these data can be considered arbitrary binary data
> (I’m thinking along the lines of C structs and pointers).
>
> Does this approach handle the pointer issue? We were concerned that a
> malloc’d array of, say, int pointers would serialize as a reference
> rather than the actual array of ints.

Pointers are not going to serialize very well. The problem is that the
machine on the receiving end most likely will not have the exact same
memory space as the machine that did the sending. A pointer on the
sender would not necessarily be a valid pointer on the receiver.

One option is to convert your C structures into Ruby Struct objects.
Those can be marshalled over a network connection. Again, special care
will need to be taken with pointer values.

If you can get all your data to map to Struct objects and fundamental
Ruby data types (Float, Integer, String), then the Marshal mechanism
will take care of endian-ness issues for you.
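
A minimal sketch of the Struct idea. The Reading struct and its fields
are hypothetical, chosen only to show a pointer field being copied out
into an Array so that values travel instead of addresses.

```ruby
# Sketch: mirror a C struct as a Ruby Struct and let Marshal carry it.
# "Reading" and its members are invented for this example.
Reading = Struct.new(:sensor_id, :value, :label, :samples)

# A C field such as `int *samples` is copied into an Array of Integers
# before wrapping, so no raw address ever reaches the wire.
original = Reading.new(7, 3.25, 'probe-a', [10, 20, 30])

wire = Marshal.dump(original)    # bytes suitable for writing to a socket
received = Marshal.load(wire)    # the receiver must define Reading too
```

The receiving process needs the same Struct definition under the same
constant name, otherwise Marshal.load raises an ArgumentError.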

“For every complex problem there is a solution that is simple, neat, and
wrong.”
– H. L. Mencken

Blessings,
TwP

On 12/12/06, [email protected] wrote:

> i think you really want to be using mmap. that way there is no ‘passing’.
> the interface to mmap is as a string, so joel’s comments there apply. mmap is
> available on *nix or windoze.

Hmmm … are you suggesting that he memory map all the C data to a
file and then send the file over the network to the receiving machine?

TwP

Paul-

You are correct–these data can be considered arbitrary binary data
(I’m thinking along the lines of C structs and pointers).

Does this approach handle the pointer issue? We were concerned that a
malloc’d array of, say, int pointers would serialize as a reference
rather than the actual array of ints.

Thanks for the quick reply!

Nathan

Guys,

Thanks for all the good responses. A few items:

  • We are aware of the struct option. Our original approach was to
    require all parameters to be Ruby data types or well-known structs,
    which could be wrapped and passed to Ruby. However, there is still the
    pointer issue: since the struct might be arbitrary, it might contain
    something like an int** (or even a void*) which cannot be wrapped and
    serialized appropriately (as Tim pointed out, we want to make no
    assumptions regarding the memory state of other connected nodes).
  • How are you suggesting we use mmap to solve this problem? If there is
    a way that we could just take all the data in one big binary chunk,
    represent it with a string, and send it over the wire, that would be
    excellent. However, again it seems to me that pointers pose a
    problem.

Thanks for the suggestions. But I’m still not sure how this solves our
problem in the general sense (obviously if we were just trying to
support a single piece of C code then this problem could be solved more
specifically). Code examples for the C side would be especially
appreciated, since we’re less familiar with that aspect.

Much appreciated,
Nathan

On Wed, 13 Dec 2006, Tim P. wrote:

> Hmmm … are you suggesting that he memory map all the C data to a
> file and then send the file over the network to the receiving machine?
>
> TwP

i was just suggesting using mmap with MAP_SHARED to pass the data between
c and ruby. that is easy from c; then, in ruby, you can do

string = mmap.to_s

socket.write [string.size].pack('N')
socket.write string
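
The snippet above only shows the sending half. A minimal sketch of both
halves of that length-prefixed framing follows, assuming a 4-byte
big-endian length as in pack('N'). The helper names write_frame,
read_exactly and read_frame are hypothetical, and a StringIO stands in
for the socket so the sketch runs on its own. It uses bytesize rather
than size so the count is right even if the String is not binary-encoded.

```ruby
require 'stringio'

# Write one frame: 4-byte big-endian length, then the payload bytes.
def write_frame(io, payload)
  io.write([payload.bytesize].pack('N'))
  io.write(payload)
end

# Read exactly n bytes, or raise if the stream ends early.
def read_exactly(io, n)
  data = io.read(n)
  raise EOFError, 'stream closed mid-frame' if data.nil? || data.bytesize < n
  data
end

# Read one frame that was written by write_frame.
def read_frame(io)
  length = read_exactly(io, 4).unpack1('N')
  length.zero? ? ''.b : read_exactly(io, length)
end

# StringIO stands in for a TCPSocket here.
wire = StringIO.new(''.b)
write_frame(wire, "first\x00blob".b)
write_frame(wire, [1, 2, 3].pack('N*'))
wire.rewind
first  = read_frame(wire)
second = read_frame(wire)
```

The length prefix is what lets the receiver find message boundaries,
since TCP itself delivers an unstructured byte stream.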

regards.

-a

On Wed, 13 Dec 2006, Nathan B. wrote:

>   • How are you suggesting we use mmap to solve this problem? If there is a
>     way that we could just take all the data in one big binary chunk, represent
>     it with a string, and send it over the wire, that would be excellent.
>     However, again it seems to me that pointers provide a problem.

there is no automatic solution for c serialization. you have to code that
yourself. you have to set up code to marshal and un-marshal your object,
and it has to be custom. the data send part, however, can be dealt with
fairly easily.
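
To give a feel for what such custom code involves, here is a minimal
sketch in ruby rather than c, assuming a record that in c would hold a
count plus an `int *` to malloc'd storage. The class name
IntArrayRecord and the wire layout are hypothetical. The point is that
the pointed-to values get flattened into bytes and the address itself
is never written.

```ruby
# Sketch: hand-written serialization for a record that owns an int array.
# The class name and byte layout are invented for this example.
class IntArrayRecord
  attr_reader :values

  def initialize(values)
    @values = values
  end

  # Flatten to bytes: 4-byte big-endian count, then each int as a
  # 4-byte big-endian signed value. No addresses are written.
  def to_wire
    [@values.length].pack('N') + @values.pack('l>*')
  end

  # Rebuild a record from bytes produced by to_wire.
  def self.from_wire(bytes)
    raise ArgumentError, 'missing count' if bytes.bytesize < 4
    count = bytes.unpack1('N')
    values = bytes.byteslice(4, count * 4).unpack('l>*')
    raise ArgumentError, 'truncated record' unless values.length == count
    new(values)
  end

  # Hooks that make Marshal.dump / Marshal.load use the same layout.
  def marshal_dump
    to_wire
  end

  def marshal_load(bytes)
    @values = self.class.from_wire(bytes).values
  end
end

record = IntArrayRecord.new([1, -2, 300_000])
copy = Marshal.load(Marshal.dump(record))
```

Fixing the byte order in the layout means both ends agree regardless of
their native endianness. Nested pointers such as int** need the same
treatment applied level by level, which is why this cannot be made
fully generic.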

> Thanks for the suggestions. But I’m still not sure how this solves our
> problem in the general sense (obviously if we were just trying to support a
> single piece of C code then this problem could be solved more specifically).
> Code examples for the C side would be especially appreciated, since we’re
> less familiar with that aspect.

contact me offline - i’d post an example but i’m thinking 100 lines of c
code is a bit off-topic!

kind regards.

-a

[email protected] wrote:

> there is no automatic solution for c serialization. you have to code that
> yourself. you have to set up code to marshal and un-marshal your object,
> and it has to be custom. the data send part, however, can be dealt with
> fairly easily.

You can use an ASN.1 compiler to create serializers and deserializers
for your C data. The open source ASN.1 compiler at
http://lionet.info/asn1c can serialize C stuff into XML, which is
probably the easiest format for Ruby to handle afterwards. Pointers,
arrays and other data structures are supported.

On Fri, 15 Dec 2006, Lev Walkin wrote:

> there is no automatic solution for c serialization. you have to code that
> [...]
> supported.
thanks for the link!

-a