Hi all,
I’m parsing a binary file, and need to read an integer, something I
would do in C like this:
int b;
read(f, &b, sizeof(int));
obviously considering endianness. I’m pretty sure there has to be a
faster way to do it, but this is how I’m doing it right now (as you
can see, pretty naive):
class IO
  # read a 32-bit int, assume little endian
  def geti
    c1 = getbyte  # getbyte returns an Integer; getc returns a String on Ruby 1.9+
    c2 = getbyte
    c3 = getbyte
    c4 = getbyte
    c4 << 24 | c3 << 16 | c2 << 8 | c1
  end
end
What would be the ruby-way to do it?
thanks for any tip…
On 10/12/06, Rolando A. [email protected] wrote:
What would be the ruby-way to do it?
thanks for any tip…
class IO
  def geti( endian = :little )
    str = self.read( 4 )
    str = str.reverse if endian == :little
    str.unpack( 'N' )[0]
  end
end
By default this method treats the four bytes as little endian: it reverses
them and then unpacks the result as a big-endian ('N') value. You can change
this by passing :big as an argument …
io.geti :big
It does not have to be :big; any value other than :little skips the reversal.
I’m just following the metaphor of using :little for little endian byte order.
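As an alternative to reversing the string, unpack has a directive for each byte order: 'V' is a 32-bit unsigned little-endian value and 'N' is the big-endian one. Here is a minimal sketch of that idea. The helper name read_uint32 is made up for illustration and is not from the code above; it is written as a plain method so it also works with StringIO.

```ruby
require 'stringio'

# Hypothetical helper, not from the thread: read one unsigned 32-bit
# integer from any IO-like object, choosing the unpack directive by
# byte order instead of reversing the string.
def read_uint32(io, endian = :little)
  bytes = io.read(4)
  # read returns nil at EOF and may return fewer than 4 bytes near it
  return nil if bytes.nil? || bytes.bytesize < 4
  directive = (endian == :little) ? 'V' : 'N'  # V = little endian, N = big endian
  bytes.unpack(directive)[0]
end

io = StringIO.new("\x01\x02\x03\x04".b)
little = read_uint32(io)        # 0x04030201
io.rewind
big = read_uint32(io, :big)     # 0x01020304
```

The nil check is a design choice for short reads; raising EOFError instead would be just as reasonable.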
Blessings,
TwP
On Fri, 2006-10-13 at 00:35 +0900, Rolando A. wrote:
Hi all,
I’m parsing a binary file, and need to read an integer,
Check out String#unpack:
(dta = File.read('ints.bin')).unpack('I' * (dta.length / 4))
=> [1234, 2345, 3456]
It has different type specifiers for endianness and so on. Also, if you
gotta crawl through the file, you can tell IO#read how many bytes you
want:
File.open('ints.bin') { |f| puts f.read(4).unpack('I') until f.eof? }
1234
2345
3456
=> nil
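Note that 'I' uses the machine's native byte order and int size. If the file format fixes the byte order, a variant along these lines may be safer. This is only a sketch: it assumes little-endian 32-bit values, and it writes its own scratch file so it can be run as is. The '*' count tells unpack to consume every remaining value, so the length arithmetic is not needed.

```ruby
require 'tempfile'

ints = nil
Tempfile.create('ints') do |f|
  f.binmode
  # V = 32-bit unsigned, little endian, regardless of platform
  f.write([1234, 2345, 3456].pack('V*'))
  f.flush
  # binread avoids newline translation on platforms that have a text mode
  ints = File.binread(f.path).unpack('V*')
end
```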
Thanks a lot, I’ll try that!!!
On 10/12/06, Rolando A. [email protected] wrote:
Thanks a lot, I’ll try that!!!
No problem. Take a look at bit-struct if you find yourself needing to
do some more complex packing and unpacking of binary data …
http://raa.ruby-lang.org/project/bit-struct/
Blessings,
TwP
On Fri, 13 Oct 2006, Tim P. wrote:
TwP
have you been using this for your stuff tim?
-a
On Fri, 13 Oct 2006, Tim P. wrote:
have you been using this for your stuff tim?
No, we’ve just been parsing very large pixel images. No complex data
structures. Read four bytes, mask off the hamming code and error
bits, store the pixel data in an mmap cache, repeat until EOF.
have you looked into using narray? then you can mask the entire image at
once. i have code that turns an mmap into an narray - it’s quite simple. got
a sample file?
-a
On 10/12/06, [email protected] [email protected] wrote:
have you been using this for your stuff tim?
No, we’ve just been parsing very large pixel images. No complex data
structures. Read four bytes, mask off the hamming code and error
bits, store the pixel data in an mmap cache, repeat until EOF.
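The read-and-mask step could look roughly like the sketch below. The bit layout here is purely an assumption for illustration (low 24 bits are pixel data, high 8 bits are the Hamming code and error bits, words are big endian); the real layout is not described in this thread. PIXEL_MASK and pixels_from are hypothetical names.

```ruby
# HYPOTHETICAL layout, assumed only for illustration: the low 24 bits of
# each big-endian 32-bit word carry pixel data, the high 8 bits carry
# the Hamming code and error bits.
PIXEL_MASK = 0x00FF_FFFF

# Unpack every 32-bit word in the string and keep only the pixel bits.
def pixels_from(data)
  data.unpack('N*').map { |word| word & PIXEL_MASK }
end

words  = [0xAB00_0001, 0xCD12_3456].pack('N*')
pixels = pixels_from(words)
```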
From the bit-struct readme …
“BitStruct is most efficient when your data is primarily treated as a
binary string, and only secondarily treated as a data structure. (For
instance, you are routing packets from one socket to another, possibly
looking at one or two fields as it passes through or munging some
headers.) If accessor operations are a bottleneck, a better approach
is to define a class that wraps an array and uses pack/unpack when the
object needs to behave like a binary string.”
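The "class that wraps an array" approach from the readme might look something like this. It is a sketch only: the Record class, its two fields, and the 'Vv' layout (a 32-bit id and 16-bit flags, both little endian) are invented for the example and are not part of bit-struct.

```ruby
# Hypothetical record type; field names and the 'Vv' layout are assumptions.
class Record
  FORMAT = 'Vv'

  # Unpack once, when the binary string enters the program.
  def self.from_binary(str)
    new(*str.unpack(FORMAT))
  end

  def initialize(id, flags)
    @fields = [id, flags]
  end

  # Accessors are plain array lookups, so they stay cheap.
  def id;    @fields[0]; end
  def flags; @fields[1]; end

  def id=(value)
    @fields[0] = value
  end

  # Pack only when the object has to behave like a binary string.
  def to_s
    @fields.pack(FORMAT)
  end
end

rec = Record.from_binary([7, 3].pack('Vv'))
rec.id = 42
packed = rec.to_s
```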
TwP
Tim P.:
On 10/12/06, Rolando A. [email protected] wrote:
int b;
read(f, &b, sizeof(int));
class IO
  def geti( endian = :little )
    str = self.read( 4 )
    str = str.reverse if endian == :little
    str.unpack( 'N' )[0]
  end
end
Yet you cannot be sure that sizeof(int) is 4.
Kalman
On 10/12/06, Kalman N. [email protected] wrote:
Yet you cannot be sure that sizeof(int) is 4.
No, but in my case it works just fine. It will always be a 4-byte unsigned
integer.
Kalman
regards,
Rolando A. wrote:
rolando – [[ knowledge is empty, fill it ]] –
“Tam pro papa quam pro rege bibunt omnes sine lege.”
Quicquid Venus imperat, labor est suavis…
Hal
Robert K.:
Kalman N. wrote:
Tim P.:
str = self.read( 4 )
Yet you cannot be sure that sizeof(int) is 4.
str = read( [0].pack('N').length )
Hey, only now I learnt that sizeof(int) is 4 even on my amd64 machine. I had
to check that with a C program to make me believe it.
Kalman
On 12.10.2006 18:10, Kalman N. wrote:
end
Yet you cannot be sure that sizeof(int) is 4.
Kalman
class IO
  def geti( endian = :little )
    str = read( [0].pack('N').length )
    str.reverse! if endian == :little
    str.unpack( 'N' )[0]
  end
end
Regards
robert
On 10/13/06, Robert K. [email protected] wrote:
On 12.10.2006 18:10, Kalman N. wrote:
class IO
  def geti( endian = :little )
    str = read( [0].pack('N').length )
    str.reverse! if endian == :little
    str.unpack( 'N' )[0]
  end
end
Ooooo … clever!
class IO
  SIZEOF_INT = [0].pack('N').length
  def geti( endian = :little )
    str = read( SIZEOF_INT )
    str.reverse! if endian == :little
    str.unpack( 'N' )[0]
  end
end
I’m too lazy to benchmark it today, but is reverse! faster than
reverse on strings?
Blessings,
TwP
On Sat, 2006-10-14 at 02:42 +0900, Tim P. wrote:
end
end
I thought that 'N' was always a 32-bit integer in network byte order?
According to the docs, platform-independent sizes are used everywhere
except for the SsIiLl directives when they are followed by an underscore…?
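This is easy to check from Ruby itself. The sketch below packs a zero with a few directives and looks at the resulting size; the variable names are just for illustration. 'N' and 'V' should come out as 4 bytes everywhere, while 'i' and 'l_' follow the platform's native int and long.

```ruby
# 'N' and 'V' are fixed at 32 bits on every platform.
fixed_n = [0].pack('N').bytesize
fixed_v = [0].pack('V').bytesize

# 'i' uses the platform's native int and 'l_' the native long, so these
# sizes can differ between machines.
native_int  = [0].pack('i').bytesize
native_long = [0].pack('l_').bytesize

puts "N=#{fixed_n} V=#{fixed_v} i=#{native_int} l_=#{native_long}"
```

So if the goal is to track the C compiler's sizeof(int), [0].pack('i') is the one to measure, not [0].pack('N').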
Tim P. wrote:
end
end
I’m too lazy to benchmark it today, but is reverse! faster than
reverse on strings?
Yes, most likely. No new object is created.
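For anyone who wants to measure it, a rough benchmark could look like the sketch below. The iteration count and the 4-byte sample string are arbitrary choices, and both cases dup the source first so that the only difference is the extra string that reverse allocates. Timings will vary by machine and Ruby version.

```ruby
require 'benchmark'

n   = 200_000
src = "\x01\x02\x03\x04".b

Benchmark.bm(9) do |x|
  # reverse builds and returns a new String
  x.report('reverse')  { n.times { s = src.dup; s = s.reverse } }
  # reverse! rewrites the receiver in place
  x.report('reverse!') { n.times { s = src.dup; s.reverse! } }
end
```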
robert