Ben N. wrote:
Have you thought about not using String as the base class? For instance,
OpenStruct would be almost OK for my purposes, if it sustained ordered
output. If I had to hack things up without guidance I would probably start
with a Hash and have :fieldname -> pos, val, type internally. You wouldn’t
be able to treat the whole object like a string directly, but overloading
to_s shouldn’t be too ugly syntactically? The type definition would still be
used to meta-create a class ‘parse’ method that does the parsing, to convert
from a raw string (or I guess you could just use o=Class.new(String)).
This is a good point (about String as the base class), and it brings up
the threshold at which bit-struct loses its usefulness. If you’re doing
a lot of complex accessor operations (esp. the var-length fields), then
operating on a string just gets hopelessly mucky. It’s better to use
some structured data type, and follow the parse->operate->unparse cycle.
BitStruct has been useful in cases where I only need to touch a field or
two and then just pass the string on somewhere else (a socket, a file, a
database, etc.). In these cases, parsing all the fields is a waste of
time.
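As a rough illustration of the "touch one field and pass it on" case, here is a minimal sketch using only plain String operations with #pack and #unpack, not BitStruct's own API. The packet layout (2-byte id, 2-byte length, payload) and the helper names are assumptions made up for the example.

```ruby
# Hypothetical layout: 2-byte big-endian id, 2-byte big-endian length, payload.
LENGTH_OFFSET = 2
LENGTH_SIZE   = 2

# Read one field straight out of the raw string; nothing else is parsed.
def read_length(packet)
  packet.byteslice(LENGTH_OFFSET, LENGTH_SIZE).unpack("n").first
end

# Overwrite one field in place; the rest of the string is left untouched.
def write_length(packet, value)
  packet[LENGTH_OFFSET, LENGTH_SIZE] = [value].pack("n")
  packet
end

# Keep the string binary so that indexing is by byte.
packet = ([7, 5].pack("nn") + "hello").b
write_length(packet, 300)
```

After the update the string can go straight to a socket or file, with the id and payload never having been decoded.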
So, what kind of data structure to use…
A hash of fieldname => [pos, val, type] has the disadvantage that each
field must know its position. If you increase the length of one field,
you have to search for all other fields with higher pos, and increase
their pos.
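To make the bookkeeping cost concrete, here is a small sketch of that hash layout. The field names, the use of pack directives as the "type", and the resize_field helper are all hypothetical.

```ruby
# Hypothetical field table: name => [pos, val, type], type being a pack directive.
fields = {
  id:      [0, 7,      "n"],
  name:    [2, "abc",  "a3"],
  trailer: [5, 0xFFFF, "n"],
}

# Resizing one field forces a pass over every field stored after it.
def resize_field(fields, name, new_val, new_type)
  pos, old_val, old_type = fields[name]
  delta = [new_val].pack(new_type).bytesize - [old_val].pack(old_type).bytesize
  fields[name] = [pos, new_val, new_type]
  fields.each_value do |entry|
    entry[0] += delta if entry[0] > pos
  end
  fields
end

resize_field(fields, :name, "abcdef", "a6")
```

Growing :name by three bytes moves :trailer from position 5 to position 8, and every later field would need the same adjustment.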
An array of values, with defined accessors plus #parse and #to_s
methods, is probably better. I think Ara Howard’s arrayfields lib might
be a place to start, and then you can implement #parse and #to_s using
#unpack and #pack. You don’t need to keep track of pos and update it
each time a field changes size, as long as each field knows its
(current) length. Don’t worry about actual offsets except in #to_s. Be
lazy.
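A minimal sketch of the array-of-values idea follows. It uses a plain Array rather than arrayfields, and the Record class, its three fields, and the wire layout (2-byte id, 1-byte name length, name, 2-byte flags) are assumptions chosen only to show a variable-length field.

```ruby
# Hypothetical record: 2-byte big-endian id, 1-byte name length,
# variable-length name, 2-byte big-endian flags.
class Record
  FIELDS = [:id, :name, :flags].freeze

  # One reader and one writer per field, each backed by a slot in @values.
  FIELDS.each_with_index do |field, i|
    define_method(field) { @values[i] }
    define_method("#{field}=") { |v| @values[i] = v }
  end

  def initialize(id, name, flags)
    @values = [id, name, flags]
  end

  # Offsets matter only here and in #to_s.
  def self.parse(str)
    id, name_len = str.unpack("nC")
    name, flags  = str.unpack("@3a#{name_len}n")
    new(id, name, flags)
  end

  # The length byte is recomputed from the current name, so no stored
  # position ever needs updating.
  def to_s
    id, name, flags = @values
    [id, name.bytesize, name, flags].pack("nCa*n")
  end
end

rec = Record.parse([1, 3, "bob", 0x8000].pack("nCa*n"))
rec.name = "roberta"   # no offsets to fix up
raw = rec.to_s
```

Changing the name is a plain array-slot assignment; the cost of laying the fields out again is paid once, in #to_s.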
With this approach, the accessors will be much more efficient than
BitStruct's, but #parse and #to_s will be less efficient.
fields like length so it needs to be done last… ok my brain just exploded.
It’s probably better to compute the checksum in terms of the string
representation, rather than try to perform the calculation in terms of
individual field values (which may be in the wrong byte order, may have
too much precision, may need to be shifted into position in a bit field,
…).
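Here is a sketch of computing the checksum over the packed string as the last step. The layout (2-byte length, 2-byte checksum, payload) and the simple 16-bit byte sum are assumptions for illustration, not any particular protocol's algorithm.

```ruby
# Hypothetical layout: 2-byte big-endian length, 2-byte checksum, payload.
CHECKSUM_OFFSET = 2

# Stand-in checksum: sum of all bytes, truncated to 16 bits.
def byte_sum16(str)
  str.each_byte.sum & 0xFFFF
end

def build_packet(payload)
  length = 4 + payload.bytesize
  # Pack everything first, with the checksum field zeroed.
  packet = [length, 0].pack("nn") + payload.b
  # The checksum goes in last, computed over the final bytes (length included).
  packet[CHECKSUM_OFFSET, 2] = [byte_sum16(packet)].pack("n")
  packet
end

def valid?(packet)
  stored = packet.byteslice(CHECKSUM_OFFSET, 2).unpack("n").first
  zeroed = packet.b                      # binary copy, original untouched
  zeroed[CHECKSUM_OFFSET, 2] = "\0\0".b
  byte_sum16(zeroed) == stored
end

pkt = build_packet("hello")
```

Because the sum runs over bytes that are already in wire order, byte order and bit-field shifting have been settled by #pack before the checksum ever sees them.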
I hope you find it worthwhile to work on a library like this.