Binary file modification

Hi

I’ve been modifying a binary file that contains various data including
some audio. I’m trying to add my own audio (instead of the audio in the
file) which I believe I have done but I need to modify the
content-length header in the file which indicates the length of the
audio sample. I’ve been reading the content-length data from the file
using something similar to :

f=open(config[:file],"rb")
f.pos=AUDIO_CONTENT_HEADER_OFFSET
length=f.read(3).unpack("H2H2H2").hex.to_i
f.close
=> an integer

This basically opens the file as a binary file, skips to the audio
content header data position and then unpacks 3 bytes into a string
(formatted as hex) which is then converted to an integer value.
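
For readers following along, here is a minimal sketch of the read step described above. It uses an in-memory StringIO instead of a real file so that it runs on its own. HEADER_OFFSET, read_length and the sample bytes are made up for illustration, and the sketch assumes the three bytes are stored most significant byte first, which is what the hex conversion implies.

```ruby
require 'stringio'

# Hypothetical offset, standing in for AUDIO_CONTENT_HEADER_OFFSET.
HEADER_OFFSET = 4

# Read three bytes at +offset+ and interpret them as a big-endian integer.
def read_length(io, offset)
  io.pos = offset
  bytes = io.read(3)
  raise EOFError, 'length field is truncated' if bytes.nil? || bytes.bytesize < 3
  # 'H*' yields one hex string for all bytes, e.g. "000400".
  bytes.unpack('H*').first.to_i(16)
end

# Four made-up header bytes, the length field 0x000400, then a payload.
sample = StringIO.new("HEAD\x00\x04\x00audio".b)
length = read_length(sample, HEADER_OFFSET)
puts length  # 1024
```

Using 'H*' rather than "H2H2H2" returns a single hex string, so no join is needed before converting.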

I’d like to be able to reverse this process and take any integer
value (1024 in the case shown below) and write it to a binary file with
some header and footer data - something like :

mydata = SOMEHEADERDATA
mydata += ["1024"].pack("someformat")
mydata += AUDIODATA
f=open(config[:file],"wb")
f.write(mydata)
f.close

However I’m a bit stuck on how to pack the data (if this is the correct
solution). If someone could point me in the correct direction it would
be much appreciated.
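
One possible shape for the write side, as a sketch only: the header and audio below are placeholders rather than the real file contents, and the length is assumed to be a 3-byte big-endian field. Array#pack has no 3-byte integer directive, so this packs 32 bits with 'N' and drops the leading byte.

```ruby
require 'tempfile'

# Placeholder values; the real header and audio come from the poster's file.
header = 'HEAD'.b
audio  = "\x01\x02".b * 512          # 1024 bytes of stand-in audio

# 'N' packs a 32-bit big-endian integer; dropping the first byte leaves 3 bytes.
length_field = [audio.bytesize].pack('N')[1, 3]

mydata = header + length_field + audio

Tempfile.create('audio') do |f|
  f.binmode
  f.write(mydata)
  f.flush
  written = File.binread(f.path)
  # Read the length field back the same way the original post does.
  puts written[4, 3].unpack('H*').first.to_i(16)   # 1024
end
```

Note that a value above 0xFFFFFF would lose its top byte without any warning here, so a range check before packing is worth adding in real code.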

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Rob L.
[…]
I’ve been reading the content-length data from the file
using something similar to :

f=open(config[:file],"rb")
f.pos=AUDIO_CONTENT_HEADER_OFFSET
length=f.read(3).unpack("H2H2H2").hex.to_i

That should die because Array#hex doesn’t exist, but I get the idea. If
you do this kind of thing a lot with different kinds of data then you
should note that to_i and to_s both take optional base arguments, so
foo.unpack('H*').first.to_i(16) should work.
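
A short illustration of the base arguments mentioned above, as a sketch with made-up sample bytes (359447 is the length value quoted later in the thread):

```ruby
raw = "\x05\x7c\x17".b              # three sample bytes

hex   = raw.unpack('H*').first      # hex digits of all three bytes
value = hex.to_i(16)                # parse those digits as base 16

puts hex                            # 057c17
puts value                          # 359447
puts value.to_s(16)                 # 57c17 (to_s does not keep the leading zero)
```

The missing leading zero in the last line matters when going back the other way: the hex string has to be padded to six digits before it can stand for exactly three bytes.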

[…]

I’d like to be able to reverse this process and take any integer
value (1024 in the case shown below) and write it to a binary file with
some header and footer data - something like :
[…]
mydata += ["1024"].pack("someformat")
[…]
However I’m a bit stuck on how to pack the data (if this is the correct
solution).

Here’s the problem - if you need exactly three bytes then you are going
to have to apply your own padding, since the pack routines will only
pack directly as a long or a short, which will mostly be 4 and 2 bytes -
both of which could cause you problems. My hacks always involve <<'ing
a single-byte integer onto a string. In your case, here is a horrible
one-liner which you should not use because it is gross.

num=1024
num.to_s(16)[0..2].instance_eval {(self.reverse + '0' * (6 -
self.length)).reverse}.scan(/../).inject('') {|s,byte| s <<
byte.to_i(16)}

You could also try googling BitStruct, which is a ruby library that
might help by defining these headers and footers as structure objects.

Cheers,

ben

Rob L. wrote:

[…]

Ben’s right, you can use BitStruct for this:

require 'bit-struct'

class AudioData < BitStruct
  unsigned :audio_length, 3*8, :endian => :little
  rest :data

  # Note: don't use :length as the name of a field, because it will
  # conflict with the #length method inherited from String.

  # the :endian option can also be :big, :network (== :big), or :native
end

audio_data = AudioData.new

data = "foo bar baz"
audio_data.data = data
audio_data.audio_length = data.length

p audio_data
p audio_data.to_s

__END__

Output:

#
"\v\000\000foo bar baz"
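
To show what those bytes mean without installing anything, here is a rough equivalent using only core Array#pack. This is not BitStruct's API, just a sketch of the same 3-byte little-endian layout: 'V' packs 32 bits least significant byte first, and the first three bytes are kept.

```ruby
data = 'foo bar baz'

# 'V' packs a 32-bit little-endian integer; the low-order bytes come first.
length_field = [data.bytesize].pack('V')[0, 3]
packet = length_field + data.b

p packet.bytes.first(3)        # [11, 0, 0]  ("\v" is byte 11)

# Reading it back: pad to four bytes so 'V' can decode the field.
decoded = (packet[0, 3] + "\x00".b).unpack('V').first
puts decoded                   # 11
```

Note that this example is little-endian, while the hex-based reading code earlier in the thread treats the first byte as the most significant one, so the :endian option would need to match the actual file format.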

On Mon, 23 Oct 2006, Rob L. wrote:

Hi

I’ve been modifying a binary file that contains various data including
some audio. I’m trying to add my own audio (instead of the audio in the
file) which I believe I have done but I need to modify the

You’ve had other answers that answer the question as put. Another
approach that may be useful is to look at SNG, a textual means of
accessing the contents of PNG graphics:

http://www.faqs.org/docs/artu/ch06s01.html#id2910193

    HTH
    Hugh

Ben N. wrote:

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Rob L.
[…]
I’ve been reading the content-length data from the file
using something similar to :

f=open(config[:file],"rb")
f.pos=AUDIO_CONTENT_HEADER_OFFSET
length=f.read(3).unpack("H2H2H2").hex.to_i

That should die because Array#hex doesn’t exist, but I get the idea. If
you do this kind of thing a lot with different kinds of data then you
should note that to_i and to_s both take optional base arguments, so
foo.unpack('H*').first.to_i(16) should work.

Thanks for pointing that out, it’s strange as the code definitely works
(it returns correct content lengths repeatedly for different files) but
as you say Array#hex doesn’t exist - I’ll look into this …

[…]

I’d like to be able to reverse this process and take any integer
value (1024 in the case shown below) and write it to a binary file with
some header and footer data - something like :
[…]
mydata += ["1024"].pack("someformat")
[…]
However I’m a bit stuck on how to pack the data (if this is the correct
solution).

Here’s the problem - if you need exactly three bytes then you are going
to have to apply your own padding, since the pack routines will only
pack directly as a long or a short, which will mostly be 4 and 2 bytes -
both of which could cause you problems. My hacks always involve <<'ing
a single-byte integer onto a string. In your case, here is a horrible
one-liner which you should not use because it is gross.

num=1024
num.to_s(16)[0..2].instance_eval {(self.reverse + '0' * (6 -
self.length)).reverse}.scan(/../).inject('') {|s,byte| s <<
byte.to_i(16)}

I’m intrigued by this, I can see you are repacking the int value into
three bytes but I’m still unclear as to how you would then add this to
the file. I guess it needs to be packed as 3 bytes - my guess was

num=1024
newpacket += [num].pack("3C")

However this doesn’t seem to work using the reading scheme above. Could
you tell me where I’m going wrong ?
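
A note on the attempt above, offered as a sketch rather than a definitive answer: the 'C' directive packs one array element into one byte, and a repeat count goes after the directive ('C3'), so a single integer cannot be spread across three bytes that way. One option is to split the number into three bytes with shifts first. The big-endian byte order here is an assumption that matches the hex-based reading code.

```ruby
num = 359447

# Split the integer into its three bytes, most significant first.
bytes = [(num >> 16) & 0xFF, (num >> 8) & 0xFF, num & 0xFF]

# 'C3' packs three array elements as three unsigned bytes.
packed = bytes.pack('C3')

p bytes                                   # [5, 124, 23]
puts packed.bytesize                      # 3
puts packed.unpack('H*').first.to_i(16)   # 359447
```

The resulting 3-byte string can be appended to newpacket with += as in the snippet above.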

You could also try googling BitStruct, which is a ruby library that
might help by defining these headers and footers as structure objects.

I’m looking at using bitstruct for the next version of the project, but
as I’ve almost got this working now I’d like to avoid a re-write at this
stage if possible !

Cheers,

ben

Thanks for all the help - you’ve saved me hours of work already !

Thanks for pointing that out, it’s strange as the code definitely works
(it returns correct content lengths repeatedly for different files) but
as you say Array#hex doesn’t exist - I’ll look into this …

A quick reply to myself - I’ve just checked the code and I’m missing an
Array#join call in the above example, so the example wasn’t correct …

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Rob L.
Sent: Monday, October 23, 2006 6:06 PM
To: ruby-talk ML
Subject: Re: Binary file modification

Ben N. wrote:
[…]

num=1024
num.to_s(16)[0..2].instance_eval {(self.reverse + '0' * (6 -
self.length)).reverse}.scan(/../).inject('') {|s,byte| s <<
byte.to_i(16)}

I’m intrigued by this, I can see you are repacking the int value into
three bytes but I’m still unclear as to how you would then add this to
the file.

It’s already packed. The return value of the long expression will be a
3-byte string which you can insert as you like (print it to the IO
handle, << it to another string, etc.). The accumulator in the inject
method is a string; {|str,byte| … might have been more readable, sorry.

ben

It does seem to work for small values like 1024 but not larger ones. I
know there will be an upper limit to the integer size I can fit into the
three bytes but the 359447 value is one I’ve taken from an existing file
so I believe this is a valid integer to try and pack into these three
bytes.

Any advice ?

Some more information on this - the maximum value that can be
successfully packed is 4095

Any advice ?

It’s already packed. The return value of the long expression will be a
3-byte string which you can insert as you like (print it to the IO
handle, << it to another string, etc.). The accumulator in the inject
method is a string; {|str,byte| … might have been more readable, sorry.

ben

Sorry I should have studied it a little closer …

I’ve tried the following and your re-packing doesn’t seem to work :

f = open("testfile","wb")
num=359447
f.write(num.to_s(16)[0..2].instance_eval {(self.reverse + '0' * (6 -
self.length)).reverse}.scan(/../).inject('') {|s,byte| s <<
byte.to_i(16)})
f.close
f = open("testfile","rb")
f.pos=0
puts f.read(3).unpack('H*').first.to_i(16)
f.close

=> 1404

It does seem to work for small values like 1024 but not larger ones. I
know there will be an upper limit to the integer size I can fit into the
three bytes but the 359447 value is one I’ve taken from an existing file
so I believe this is a valid integer to try and pack into these three
bytes.

Any advice ?
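
A possible explanation for the 1404 result and the 4095 limit, as a sketch: the slice at the start of the one-liner keeps only the first three hex digits of the number, so anything above 0xfff (4095) gets truncated. The helper name pack_3_bytes below is made up for illustration, and big-endian byte order is assumed because that is what the reading code expects.

```ruby
num = 359447

# The slice in the one-liner keeps only the first three hex digits.
puts num.to_s(16)          # 57c17
puts num.to_s(16)[0..2]    # 57c
puts '57c'.to_i(16)        # 1404

# Zero-pad to six hex digits instead, then let pack('H*') turn them into bytes.
def pack_3_bytes(num)
  raise RangeError, "#{num} needs more than 3 bytes" unless (0..0xFFFFFF).cover?(num)
  [format('%06x', num)].pack('H*')
end

packed = pack_3_bytes(num)
puts packed.bytesize                        # 3
puts packed.unpack('H*').first.to_i(16)     # 359447
```

The upper limit for a 3-byte field is 0xFFFFFF (16777215), so 359447 fits comfortably.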

Joel VanderWerf wrote:

Ben’s right, you can use BitStruct for this:

Forgot the link:

http://raa.ruby-lang.org/project/bit-struct/