Forum: Ruby DANGER ! Ruby-Newbie ahead: How to access binary files

Meino Christian Cramer (Guest)
on 2006-01-16 17:46
(Received via mailing list)
Hi,

 There is a file on my HD, which was written by a C program. The C
 program wrote the contents of an array of structures (each array
 element was made from the same structure) to that file.

 Since accessing that file looks like a very low level and
 "procedure-based" thing to me I would be very interested how this job
 can be done in a most ruby-like, object-oriented way.

 Thank you very much for any help in advance!
 Don't worry, use ruby!
 mcc
Austin Ziegler (Guest)
on 2006-01-16 17:49
(Received via mailing list)
On 16/01/06, Meino Christian Cramer <Meino.Cramer@gmx.de> wrote:
>  There is a file on my HD, which was written by a C program. The C
>  program wrote the contents of an array of structures (each array
>  element was made from the same structure) to that file.
>
>  Since accessing that file looks like a very low level and
>  "procedure-based" thing to me I would be very interested how this job
>  can be done in a most ruby-like, object-oriented way.

If you're using Windows, make sure you open the file in binary mode:

  File.open(filename, "rb") ...

Otherwise, look up String#unpack. There are examples of this in the
ImageInfo library that is on the RAA; I have a custom copy of it in
PDF::Writer.
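
A minimal sketch of that round trip, assuming each record is a native int
followed by a native float with no padding (the "ifif" template encodes that
assumption; it is not read from the file). Array#pack builds the bytes,
String#unpack decodes them, and the temp file is only there to show the "rb"
mode:

```ruby
require "tempfile"

values = Tempfile.create("records") do |f|
  f.binmode
  # Write two (int, float) records in this machine's native layout.
  f.write([40, 40.0, 2, 2.0].pack("ifif"))
  f.flush

  # "rb" keeps Windows from translating line endings in binary data.
  raw = File.open(f.path, "rb") { |io| io.read }

  # One directive per field: "i" = native int, "f" = native float.
  raw.unpack("ifif")
end

p values
```

The template has to match the C struct field by field, so a real file needs
its own template worked out from the struct definition.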

-austin
Bob Showalter (Guest)
on 2006-01-16 17:52
(Received via mailing list)
Meino Christian Cramer wrote:
> Hi,
>
>  There is a file on my HD, which was written by a C program. The C
>  program wrote the contents of an array of structures (each array
>  element was made from the same structure) to that file.
>
>  Since accessing that file looks like a very low level and
>  "procedure-based" thing to me I would be very interested how this job
>  can be done in a most ruby-like, object-oriented way.

You need to know the format of the structure and its size.

You can then use IO#read to read bytes from the file into a string and
String#unpack to extract the individual fields from the structure.

Of course, all of this should be encapsulated into a class :-)
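
Something along these lines, as a rough sketch only. The Record class, its
field names and the "if" template are made up for illustration and assume a
struct of one int and one float with no padding; a StringIO stands in for the
real file:

```ruby
require "stringio"

# Hypothetical record type mirroring `struct { int foo; float bar; }`,
# assuming no padding between or after the fields.
class Record
  FORMAT = "if"                        # native int, native float
  SIZE   = [0, 0.0].pack(FORMAT).size  # bytes per record under that assumption

  attr_reader :foo, :bar

  def initialize(foo, bar)
    @foo, @bar = foo, bar
  end

  # Read one record from io; return nil at end of file or on a short read.
  def self.read(io)
    bytes = io.read(SIZE)
    return nil if bytes.nil? || bytes.size < SIZE
    new(*bytes.unpack(FORMAT))
  end

  # Yield every record in io, in order.
  def self.each(io)
    return enum_for(:each, io) unless block_given?
    while (record = read(io))
      yield record
    end
  end
end

# Stand-in for File.open(path, "rb"): two records held in memory.
io = StringIO.new([40, 40.0, 2, 2.0].pack("ifif"))
records = Record.each(io).to_a
records.each { |r| puts "foo=#{r.foo} bar=#{r.bar}" }
```

With a real file the last three lines would become a
File.open(path, "rb") { |io| ... } block around the same Record.each call.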
unknown (Guest)
on 2006-01-16 19:14
(Received via mailing list)
On Tue, 17 Jan 2006, Meino Christian Cramer wrote:

> There is a file on my HD, which was written by a C program. The C program
> wrote the contents of an array of structures (each array element was made
> from the same structure) to that file.
>
> Since accessing that file looks like a very low level and "procedure-based"
> thing to me I would be very interested how this job can be done in a most
> ruby-like, object-oriented way.

this is a perfect use case for abstracting away the error-prone
reading/seeking/writing that one would typically do with binary data.  i
use mmap a lot for these types of tasks at work, here is a little (silly)
example:


first we build a c program to output an array of struct.  note that we
output the sizeof(struct) as the first part of the file - this is because
we can't know how the compiler will pad structs, so we make sure the
correct size is encoded into the file:

     harp:~ > cat a.c
     #include <stdlib.h>
     #include <stdio.h>

     struct foobar { int foo; float bar; };

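     int main (void)
     {
       struct foobar a[] = { {40, 40.0}, {2, 2.0} };
       int size = sizeof(struct foobar);
       /* header: the struct size, so readers need not guess the padding */
       fwrite (&size, sizeof(int), 1, stdout);
       /* body: the raw array, in this machine's byte order */
       fwrite (a, sizeof(a), 1, stdout);
       return 0;
     }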

     harp:~ > gcc a.c

     harp:~ > a.out > a


next we write a ruby class to access the data.  the access will be via
mmap, so any changes we make to the data can be transparently written to
disk with no explicit io on our part - we simply use the objects as normal:

     harp:~ > cat a.rb
     #! /usr/bin/env ruby
     require "mmap"  # ftp://moulon.inra.fr/pub/ruby/

     class Integer
       SIZEOF = [42].pack("i").size
     end
     class Float
       SIZEOF = [42.0].pack("f").size
     end
     module Foobar
       class Struct
         def initialize mmap, offset
           @mmap, @offset = mmap, offset
         end
         def foo
           @mmap[@offset, Integer::SIZEOF].unpack("i").first
         end
         def foo= i
           @mmap[@offset, Integer::SIZEOF] = [Integer(i)].pack("i")
         end
         def bar
           @mmap[@offset + Integer::SIZEOF, Float::SIZEOF].unpack("f").first
         end
         def bar= f
           @mmap[@offset + Integer::SIZEOF, Float::SIZEOF] = [Float(f)].pack("f")
         end
         def inspect
           { "foo" => foo, "bar" => bar }.inspect
         end
       end
       class List < ::Array
         def initialize mmap
           @mmap = mmap
           @sizeof = mmap[0, Integer::SIZEOF].unpack("i").first
           offset = Integer::SIZEOF
           while((offset + @sizeof) <= mmap.size)
             struct = Struct::new @mmap, offset
             self << struct
             offset += @sizeof
           end
         end
       end
       class File
         attr "path"
         attr "list"
         attr "mmap"
         def initialize path
           @path = path
           open(@path, "r+"){|f| @mmap = Mmap::new f, "rw", Mmap::MAP_SHARED}
           @list = List::new @mmap
         end
         def self::new *a, &b
           ff = super
           mmap = ff.mmap
           ::ObjectSpace::define_finalizer(ff){ mmap.msync; mmap.munmap }
           ff
         end
       end
     end

     ff = Foobar::File::new ARGV.shift
     fl = ff.list

     p fl

     fl.each{|f| f.foo = 42; f.bar = 42.0}  # automatically written!



the first time we run the program we see the data the c program wrote:

     harp:~ > a.rb a
     [{"foo"=>40, "bar"=>40.0}, {"foo"=>2, "bar"=>2.0}]


but next time we see the data automatically written by the ruby program:


     harp:~ > a.rb a
     [{"foo"=>42, "bar"=>42.0}, {"foo"=>42, "bar"=>42.0}]


this is just a silly example, but it shows how objectification of
something like this might be done in a way that really makes working with
the actual data easier.

kind regards.

-a
Joel VanderWerf (Guest)
on 2006-01-17 02:16
(Received via mailing list)
ara.t.howard@noaa.gov wrote:

>         def bar
>           @mmap[@offset + Integer::SIZEOF, Float::SIZEOF].unpack("f").first
>         end
>         def bar= f
>           @mmap[@offset + Integer::SIZEOF, Float::SIZEOF] =
> [Float(f)].pack("f")
>         end

Doesn't this arithmetic assume that the C compiler is packing the fields
of the struct? What if fields are aligned on 8 byte boundaries, for
instance? I vaguely remember having some issues like this when porting
from x86 to sparc. I guess you could add __attribute__((__packed__)) to
the struct to be sure.
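
To make the concern concrete, here is a small sketch with a made-up struct
(char followed by int). It assumes the compiler inserts three padding bytes so
that the int starts on a 4-byte boundary, which is common but not guaranteed.
The "x" directive in pack/unpack stands for one padding byte:

```ruby
# Hypothetical C struct, used only for illustration:
#   struct tagged { char tag; int count; };
# Assumption: 3 padding bytes follow `tag` so `count` is 4-byte aligned.
on_disk = [65, 1000].pack("cx3i")   # "x3" writes three padding bytes

naive   = on_disk.unpack("ci")      # ignores the padding, misreads count
aligned = on_disk.unpack("cx3i")    # "x3" skips the three padding bytes

puts "bytes per record: #{on_disk.bytesize}"
p naive
p aligned
```

Only the template that accounts for the padding gets count back as 1000; the
naive one reads the int from the wrong offset.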
unknown (Guest)
on 2006-01-17 02:31
(Received via mailing list)
On Tue, 17 Jan 2006, Joel VanderWerf wrote:

> Doesn't this arithmetic assume that the C compiler is packing the fields
> of the struct? What if fields are aligned on 8 byte boundaries, for
> instance? I vaguely remember having some issues like this when porting
> from x86 to sparc. I guess you could add __attribute__((__packed__)) to
> the struct to be sure.

absolutely.  i figured it was beyond the scope of the post to get into
that - but really the file format would need to export the shape of the
struct in some sort of header.  to do this one would need to crawl the
struct with a 'void *' and compute offsets from the address of the struct.
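
a rough sketch of what reading such a header could look like.  the layout
here (record size, then the offsets of foo and bar, all native ints) is made
up for illustration and is not an existing format; the writer side is
simulated with pack:

```ruby
# Hypothetical self-describing file (not an existing format):
#   three native ints (record size, offset of foo, offset of bar),
#   followed by the raw records.
HEADER = "iii"

def read_records(data)
  record_size, foo_off, bar_off = data.unpack(HEADER)
  header_size = [0, 0, 0].pack(HEADER).bytesize
  body = data.byteslice(header_size, data.bytesize - header_size)

  (0...body.bytesize / record_size).map do |n|
    rec = body.byteslice(n * record_size, record_size)
    # Decode each field at the offset the header declares for it.
    { "foo" => rec.byteslice(foo_off, record_size - foo_off).unpack("i").first,
      "bar" => rec.byteslice(bar_off, record_size - bar_off).unpack("f").first }
  end
end

# Simulate a writer whose compiler put four padding bytes after each field.
# A real writer would take these numbers from sizeof and offsetof in C.
int_size = [0].pack("i").bytesize
rows     = [[40, 40.0], [2, 2.0]].map { |r| r.pack("ix4fx4") }
header   = [rows.first.bytesize, 0, int_size + 4].pack(HEADER)

records = read_records(header + rows.join)
p records
```

the reader never hard-codes the padding; it only trusts the offsets in the
header, so the same code would cope with a packed struct too.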

of course, this would be about the point where one should pull out xdr or
some such.  in practice, however, one often needs to read binary data
written by a program beyond one's control and the unpack approach will work
most of the time - wouldn't launch rockets with it though!

cheers.

-a