Question on networking with custom binary interface

So I am working on this Ruby server application for Windows that needs
to communicate with a client. Standard TCP sockets are being used, but
the data being passed is kind of weird. The problem is mostly that the
program I am communicating with on the client is a C++ program I cannot
modify. However, I have the documentation for its binary interface, so
if I can make all the data I send match the documentation, it should
work.

My main concern is that the documentation is VERY specific. As in, the
first four bytes have to be an unsigned long, the next two bytes have to be
an unsigned short, and the next is an unsigned char. Then some places in
the interface allow for a string of length N, and the only guarantee is
that the string will be null terminated to mark its end. And then
there are some odd data types, like a four byte data type
only referred to as “DWORD” and an eight byte data type called
“FILETIME.dwLowDateTime(DWORD)”.

Any advice for doing this type of binary IO?

Greg C. wrote:

Any advice for doing this type of binary IO?

Array#pack, String#unpack are your friends. See also

http://redshift.sourceforge.net/bit-struct/
http://rubyforge.org/projects/binaryparse/
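As a rough illustration of what pack/unpack look like for the layout described in the question (a four byte unsigned long, a two byte unsigned short, an unsigned char, then a null terminated string), here is a minimal sketch. The field values and the little-endian byte order are assumptions made for the example; the real byte order has to come from the interface documentation.

```ruby
# Directives: V = 32-bit unsigned little-endian, v = 16-bit unsigned
# little-endian, C = 8-bit unsigned, Z* = string with a trailing null.
# Use N and n instead of V and v if the protocol turns out to be big-endian.
HEADER_FORMAT = "VvCZ*"

# Hypothetical field values, chosen only for illustration.
packet = [0x01020304, 0x0506, 7, "hi"].pack(HEADER_FORMAT)
p packet.bytes
# => [4, 3, 2, 1, 6, 5, 7, 104, 105, 0]

# The same format string decodes the bytes; Z* stops at the first null.
fields = packet.unpack(HEADER_FORMAT)
p fields
# => [16909060, 1286, 7, "hi"]
```

The resulting String is just a sequence of bytes, so it can be written to the TCP socket as is.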

I don’t have anything to add to Joel’s excellent reply, but one thing
caught my eye:

On 30.06.2009 00:08, Greg C. wrote:

My main concern is that the documentation is VERY specific.

That sounds odd to me. Most of the time people complain that there is
no documentation or that it’s inaccurate. Apparently you got
documentation leaving no questions and are concerned. This is the best
that could happen to you in this situation. Why are you concerned?

Kind regards

robert

PS: One additional heads up: when encoding and decoding numbers pay
special attention to byte ordering (big endian, little endian).

Robert K. wrote:

Why are you concerned?

Well my concern was since this was specific documentation intended for
C++ programs, I was worried about how Ruby would attempt to handle this
under the hood. I mean, I would not even worry if the client program
was also a Ruby program because I could be sure that one Ruby program
would know what another Ruby program is saying, but I couldn’t be sure
that some proprietary C++ program would know what a Ruby program was
trying to say. Mostly I was worried about compatibility of data between
programs. That is why I was worried about how specific the
documentation was. Otherwise it is greatly appreciated.

Also, for a quick switch in topic, thank you guys for your help. You
are really helping an intern out.

2009/6/30 Greg C. wrote:

Mostly I was worried about compatibility of data between
programs. That is why I was worried about how specific the
documentation was.

Well, over the network it’s just bytes and I haven’t heard yet that
Ruby bytes are any different from C++ bytes. :-) The only tricky part
is to get the ordering of bytes right. See also:

http://en.wikipedia.org/wiki/Network_byte_order#Endianness_in_networking
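To make the byte ordering point concrete, here is a small sketch showing the same 32-bit value packed both ways. The value is arbitrary and only there for illustration.

```ruby
value = 0x0A0B0C0D

big    = [value].pack("N")   # network (big-endian) order
little = [value].pack("V")   # little-endian order

p big.bytes     # => [10, 11, 12, 13]
p little.bytes  # => [13, 12, 11, 10]

# Decoding with the wrong directive raises no error, it just
# yields a different number.
p big.unpack("N").first == value  # => true
p big.unpack("V").first == value  # => false
```

Which directive is right depends entirely on what the documentation of the C++ program says about byte order.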

Also, for a quick switch in topic, thank you guys for your help. You
are really helping an intern out.

You’re welcome!

Kind regards

robert

2009/6/30 Greg C. wrote:

How should I approach custom byte structures like this?
irb(main):004:0> s = (1..8).map {|x| x.chr}.join
=> "\x01\x02\x03\x04\x05\x06\a\b"
irb(main):005:0> x = s.unpack "Q"
=> [578437695752307201]
irb(main):006:0> x.first.class
=> Bignum

If you need more flexibility you can do something like this

irb(main):009:0> s.unpack("N2").inject(0) {|s,a| s << 32 | a}
=> 72623859790382856
irb(main):010:0> s.unpack("L2").inject(0) {|s,a| s << 32 | a}
=> 289077004534744581

Note, I did not bother to check byte orderings which you can easily
guess from the different results. This is just an illustration to
give you an idea.
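Applied to the FILETIME type mentioned in the question, the same idea might look like the sketch below. It assumes the eight bytes arrive as two little-endian DWORDs with the low one first (the order of dwLowDateTime and dwHighDateTime in the Win32 struct); whether the wire format really matches that is something to check against the documentation. The method names are made up for the example.

```ruby
# Seconds between 1601-01-01 (FILETIME epoch) and 1970-01-01 (Unix epoch).
FILETIME_EPOCH_OFFSET = 11_644_473_600
TICKS_PER_SECOND = 10_000_000  # FILETIME counts 100-nanosecond ticks

# Decode 8 bytes (low DWORD first, each little-endian) into a UTC Time.
def decode_filetime(bytes)
  low, high = bytes.unpack("VV")
  ticks = (high << 32) | low
  Time.at(Rational(ticks, TICKS_PER_SECOND) - FILETIME_EPOCH_OFFSET).utc
end

# Encode a Time into the same 8-byte layout.
def encode_filetime(time)
  ticks = (time.to_i + FILETIME_EPOCH_OFFSET) * TICKS_PER_SECOND + time.nsec / 100
  [ticks & 0xFFFFFFFF, ticks >> 32].pack("VV")
end

stamp = Time.utc(2009, 6, 30, 0, 8, 0)
wire  = encode_filetime(stamp)
p wire.bytesize                   # => 8
p decode_filetime(wire) == stamp  # => true
```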

Kind regards

robert

Joel VanderWerf wrote:

Greg C. wrote:

Any advice for doing this type of binary IO?

Array#pack, String#unpack are your friends.

I am looking at the documentation of these methods and this covers a LOT
of what I need to do, but I am worried in a couple areas. For example,
part of the binary string I will be getting will contain an eight byte
unsigned long long. How should I approach custom byte structures like
this?

Greg C. wrote:

How should I approach custom byte structures like
this?

Ok, Array#pack, String#unpack are your relatives. BitStruct is your
friend.

require 'bit-struct'

class MyPacket < BitStruct
  unsigned :x, 8*8, "The x field", :endian => :network
  # :network is the default, and it's the same as :big
  unsigned :y, 8*8, "The y field", :endian => :little
end

pkt = MyPacket.new
pkt.x = 59843759843759843
pkt.y = 59843759843759843

p pkt.x # 59843759843759843
p pkt.y # 59843759843759843

p pkt.inspect
# "#<MyPacket x=59843759843759843, y=59843759843759843>"

p pkt.to_s.inspect
# "\"\\000\\324\\233\\225\\037\\202\\326\\343\\343\\326\\202\\037\\225\\233\\324\\000\""

puts pkt.inspect_detailed
# MyPacket:
#   The x field = 59843759843759843
#   The y field = 59843759843759843

puts MyPacket.describe
# byte: type name [size] description
# ----------------------------------------------------------------------
# @0: unsigned x [ 8B] The x field
# @8: unsigned y [ 8B] The y field

Oops, there was an extra layer of inspection in two of the output lines.
Fixed:

require 'bit-struct'

class MyPacket < BitStruct
  unsigned :x, 8*8, "The x field", :endian => :network
  # :network is the default, and it's the same as :big
  unsigned :y, 8*8, "The y field", :endian => :little
end

pkt = MyPacket.new
pkt.x = 59843759843759843
pkt.y = 59843759843759843

p pkt.x # 59843759843759843
p pkt.y # 59843759843759843

p pkt
# #<MyPacket x=59843759843759843, y=59843759843759843>

p pkt.to_s
# "\000\324\233\225\037\202\326\343\343\326\202\037\225\233\324\000"

puts pkt.inspect_detailed
# MyPacket:
#   The x field = 59843759843759843
#   The y field = 59843759843759843

puts MyPacket.describe
# byte: type name [size] description
# ----------------------------------------------------------------------
# @0: unsigned x [ 8B] The x field
# @8: unsigned y [ 8B] The y field

Sorry to bump an old topic of mine, but I got sidetracked with work from
a different project and had one more question.

Joel VanderWerf wrote:

insert example code here

I’ve been using BitStruct and it works like a dream except in one area,
which is with strings. My first test class I am making keeps it easy by
putting it at the end of the packet so I can just use a rest field. And
although the output returned by description looks fine, when I convert
to a String for sending, the string is added on as is instead of
converted like all the other data. Here was the test code I wrote:

# test_ascii_message.rb

require 'bit-struct'

class TestAsciiMessage < BitStruct
  unsigned :message_length, 4*8, "Message Length", :endian => :network
  unsigned :message_id, 2*8, "Unique Message ID", :endian => :network
  unsigned :service_id, 1*8, "Service Identifier", :endian => :network
  unsigned :event_id, 1*8, "Event Identifier", :endian => :network
  rest :text_msg, "Null terminated ASCII Message String"
end

# main.rb

require 'test_ascii_message'

test_packet = TestAsciiMessage.new
test_packet.message_id = 0
test_packet.service_id = 2
test_packet.event_id = 3
test_packet.text_msg = "Testing 01 10 11"
test_packet.message_length = test_packet.length
puts "-"*75
puts test_packet.inspect
puts "-"*75
puts test_packet.inspect_detailed
puts "-"*75
puts TestAsciiMessage.describe
puts "-"*75
p test_packet.to_s

And the resulting output was this:


#

TestAsciiMessage:
Message Length = 24
Unique Message ID = 0
Service Identifier = 2
Event Identifier = 3
Null terminated ASCII Message String = "Testing 01 10 11"

byte: type         name          [size] description

  @0: unsigned     message_length[ 32b] Message Length
  @4: unsigned     message_id    [ 16b] Unique Message ID
  @6: unsigned     service_id    [  8b] Service Identifier
  @7: unsigned     event_id      [  8b] Event Identifier

"\000\000\000\030\000\000\002\003Testing 01 10 11"

This output is fine until it gets to the string. I just worry, like I
did above, because the program receiving this needs to see the string as
a null terminated *char string… did I say that right? Don’t know
enough C for this.

Anyways, that was my first question. The second question is how would I
go about converting a binary string I received into a BitStruct when
there is a variable length, null terminated string in the center of the
binary string? Figured I would ask since using a rest field won’t work
here.

Greg C. wrote:

"\000\000\000\030\000\000\002\003Testing 01 10 11"

This output is fine until it gets to the string. I just worry, like I
did above, because the program receiving this needs to see the string as
a null terminated *char string… did I say that right? Don’t know
enough C for this.

I see your point. You can of course add the null manually:

test_packet.text_msg = "Testing 01 10 11\0"

but that’s kinda defeating the purpose of bit-struct.

Another thing you can do is define reader/writer methods that
strip/append the null. Rename the field to _text_msg and define:

def text_msg; _text_msg[/(.*)\0\z/, 1]; end
def text_msg=(s); self._text_msg = "#{s}\0"; end

It would be easy to add this to bit-struct, I’m just not sure what the
api should be. Maybe an option like

rest :text_str, :terminator => "\0"

Thoughts?
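For comparison, here is a sketch of the same message built with plain Array#pack, where the Z* directive appends the null terminator itself. The field widths follow the describe output quoted above (32, 16, 8 and 8 bits, network order). The method names are made up, and counting both the length field and the terminator in message_length is an assumption that needs checking against the interface documentation.

```ruby
# N = 32-bit big-endian, n = 16-bit big-endian, C = 8-bit,
# Z* = string followed by a single null byte.
def encode_ascii_message(message_id, service_id, event_id, text)
  body = [message_id, service_id, event_id, text].pack("nCCZ*")
  # Assumption: message_length covers the whole packet, including
  # the 4-byte length field itself and the trailing null.
  [4 + body.bytesize].pack("N") + body
end

def decode_ascii_message(data)
  length, message_id, service_id, event_id, text = data.unpack("NnCCZ*")
  { :length => length, :message_id => message_id,
    :service_id => service_id, :event_id => event_id, :text => text }
end

packet = encode_ascii_message(0, 2, 3, "Testing 01 10 11")
p packet.bytesize       # => 25
p packet.bytes.last     # => 0
p decode_ascii_message(packet)[:text]  # => "Testing 01 10 11"
```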

Anyways, that was my first question. The second question is how would I
go about converting a binary string I received into a BitStruct when
there is a variable length, null terminated string in the center of the
binary string? Figured I would ask since using a rest field won’t work
here.

Is the field fixed length, even if the string itself is variable length?
If so, you can use a #text field, which pads with null chars:

require 'bit-struct'

class Msg < BitStruct
  unsigned :x, 16
  text :str, 10*8
  unsigned :y, 16
end

m = Msg.new(:x => 1, :str => "foo", :y => 2)
p m
p m.to_s

__END__

#<Msg x=1, str="foo", y=2>
"\000\001foo\000\000\000\000\000\000\000\000\002"
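The same fixed-width, null-padded layout can also be sketched with pack/unpack alone, in case that is easier to compare against the C++ side. This mirrors the Msg layout above (16-bit x, 10-byte string, 16-bit y, network order); the constant names are only for the example.

```ruby
# a10 pads the string with nulls to exactly 10 bytes when packing;
# Z10 reads 10 bytes and drops everything from the first null on.
PACK_FORMAT   = "na10n"
UNPACK_FORMAT = "nZ10n"

data = [1, "foo", 2].pack(PACK_FORMAT)
p data.bytesize  # => 14
p data.bytes
# => [0, 1, 102, 111, 111, 0, 0, 0, 0, 0, 0, 0, 0, 2]

p data.unpack(UNPACK_FORMAT)
# => [1, "foo", 2]
```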

Thanks for the help with the first question. As for the second question, no,
the field is not fixed length. The only guarantees are that the
string is the only variable length field, it is null terminated, its
start byte is known, and the length of the entire packet is known.
Any ideas?

Greg C. wrote:

Thanks for the help with the first question. As for the second question, no,
the field is not fixed length. The only guarantees are that the
string is the only variable length field, it is null terminated, its
start byte is known, and the length of the entire packet is known.
Any ideas?

Was afraid of that… it’s kind of a tricky case. The only way to read a
field after the variable length field is to scan to the null terminator
and then jump from that to a known offset, right? And it gets worse with
each such field. So the accessor-based approach of bit-struct becomes
complicated and inefficient. You might be better off pre-parsing the data
into blocks of fixed and variable length fields. Then use BitStruct
classes to parse the former. Wrap your own class (even just an array
with positional accessors) around this and it could be fairly nice.

Ara Howard had a suggestion along those lines, using pack/unpack instead
of bit-struct, in:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/152096

That example wasn’t for variable length fields, but it would apply. The
key point is that instead of leaving the data as a string and accessing
fields as substrings (like bit-struct does), you parse it into an array,
access the entries in the array, and write it back to a string when
needed.

Maybe a hybrid would work for you… parse into an array of BitStructs
and variable length Strings.
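A minimal sketch of that pre-parsing step, using plain strings and unpack rather than BitStruct: split the packet at the known start offset and at the first null after it, then decode the fixed-length pieces separately. The packet layout and the method name here are invented for the example.

```ruby
# Split a packet into [fixed head, string without its null, fixed tail].
# string_offset is the known start byte of the variable length string.
def split_around_cstring(data, string_offset)
  data = data.b  # work on raw bytes, whatever the original encoding
  terminator = data.index("\0", string_offset)
  raise ArgumentError, "no null terminator found" if terminator.nil?

  head = data[0, string_offset]
  text = data[string_offset, terminator - string_offset]
  tail = data[(terminator + 1)..-1]
  [head, text, tail]
end

# Hypothetical layout: 16-bit id, 8-bit flags, C string, 16-bit checksum.
packet = [1, 2].pack("nC") + "hello\0" + [0xBEEF].pack("n")

head, text, tail = split_around_cstring(packet, 3)
id, flags = head.unpack("nC")
checksum  = tail.unpack("n").first
p [id, flags, text, checksum]  # => [1, 2, "hello", 48879]
```

The head and tail pieces could just as well be handed to BitStruct classes, and writing the packet back out is then head + text + "\0" + tail.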
