IO.readint?

Hi all,
I’m parsing a binary file, and need to read an integer, something I
would do in C like this:

int b;
read(f, &b, sizeof(int));

obviously considering endianness. I’m pretty sure there has to be a
faster way to do it, but this is how I’m doing it right now (as you
can see, pretty naive):

class IO

# read int, assume little endian

def geti
c1 = getc
c2 = getc
c3 = getc
c4 = getc
c4 << 24 | c3 << 16 | c2 << 8 | c1
end
end

What would be the ruby-way to do it?
thanks for any tip…

On 10/12/06, Rolando A. wrote:

What would be the ruby-way to do it?
thanks for any tip…

class IO
def geti( endian = :little )
str = self.read( 4 )
str = str.reverse if endian == :little
str.unpack( 'N' )[0]
end
end

By default this method treats the four bytes it reads as little endian.
You can change this by passing :big as an argument …

io.geti :big

Any value other than :little selects big endian; I use :big only to
mirror :little for little endian byte order.

Blessings,
TwP
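
As a rough sketch of the same idea without reopening IO: String#unpack already has one directive per byte order ('V' for 32-bit unsigned little endian, 'N' for 32-bit unsigned big endian), so the reverse step can be dropped. The helper name read_u32 and the StringIO test data are made up for illustration and are not from the thread.

```ruby
require 'stringio'

# Hypothetical helper (not from the thread): read one 32-bit unsigned integer.
# 'V' = 32-bit unsigned little endian, 'N' = 32-bit unsigned big endian.
def read_u32(io, endian = :little)
  bytes = io.read(4)
  return nil if bytes.nil? || bytes.bytesize < 4   # EOF or truncated input
  bytes.unpack(endian == :little ? 'V' : 'N')[0]
end

# Sample data: the same value stored once in each byte order.
io = StringIO.new([1234].pack('V') + [1234].pack('N'))

little = read_u32(io)         # first four bytes, little endian
big    = read_u32(io, :big)   # next four bytes, big endian
at_eof = read_u32(io)         # nothing left to read, so nil
```

Returning nil at end of input avoids the NoMethodError that calling reverse or unpack on a nil read result would raise.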

On Fri, 2006-10-13 at 00:35 +0900, Rolando A. wrote:

Hi all,
I’m parsing a binary file, and need to read an integer,

Check out String#unpack:

(dta = File.read('ints.bin')).unpack('I' * (dta.length / 4))

=> [1234, 2345, 3456]

It has different type specifiers for endianness and so on. Also, if you
gotta crawl through the file, you can tell IO#read how many bytes you
want:

File.open('ints.bin') { |f| puts f.read(4).unpack('I') until f.eof? }

1234

2345

3456

=> nil
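
To make the point about type specifiers concrete, here is a small sketch that decodes the same four bytes with three directives. Note that 'I' (used above) is an unsigned int in the host's native size and byte order, so a file with a fixed layout is probably safer with 'V' or 'N'. The sample bytes are invented for the example.

```ruby
bytes = "\x01\x02\x03\x04".b      # four raw bytes as a binary string

le     = bytes.unpack('V')[0]     # 32-bit unsigned, little endian: 0x04030201
be     = bytes.unpack('N')[0]     # 32-bit unsigned, big endian:    0x01020304
native = bytes.unpack('L')[0]     # 32-bit unsigned, byte order of the host

puts le, be, native
```

On a little-endian machine native equals le; on a big-endian one it equals be.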

Thanks a lot, I’ll try that!!!

On 10/12/06, Rolando A. wrote:

Thanks a lot, I’ll try that!!!

No problem. Take a look at bit-struct if you find yourself needing to
do some more complex packing and unpacking of binary data …

http://raa.ruby-lang.org/project/bit-struct/

Blessings,
TwP

On Fri, 13 Oct 2006, Tim P. wrote:

TwP
have you been using this for your stuff tim?

-a

On Fri, 13 Oct 2006, Tim P. wrote:

have you been using this for your stuff tim?

No, we’ve just been parsing very large pixel images. No complex data
structures. Read four bytes, mask off the hamming code and error
bits, store the pixel data in an mmap cache, repeat until EOF.

have you looked into using narray? then you can mask the entire image
at once. i have code that turns an mmap into an narray - it's quite
simple. got a sample file?

-a
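
For readers without narray at hand, the "mask everything at once" idea can be approximated in plain Ruby by unpacking all words in one call and mapping over them. This is only a sketch: it does not use narray or mmap, and the bit layout (low 24 bits of pixel data, high 8 bits of check and error bits) is an assumption, since the thread never gives the real format.

```ruby
# Hypothetical layout: the low 24 bits hold pixel data, the top 8 bits hold
# hamming/error bits. The actual format is not stated in the thread.
def masked_pixels(data, mask = 0x00FF_FFFF)
  # 'V*' decodes every complete 4-byte little-endian word in the string.
  data.unpack('V*').map { |word| word & mask }
end

raw    = [0xAB00_0001, 0xCD12_3456].pack('V*')   # two sample words
pixels = masked_pixels(raw)                      # [0x000001, 0x123456]
```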

On 10/12/06, [email protected] [email protected] wrote:

have you been using this for your stuff tim?

No, we’ve just been parsing very large pixel images. No complex data
structures. Read four bytes, mask off the hamming code and error
bits, store the pixel data in an mmap cache, repeat until EOF.

From the bit-struct readme …

“BitStruct is most efficient when your data is primarily treated as a
binary string, and only secondarily treated as a data structure. (For
instance, you are routing packets from one socket to another, possibly
looking at one or two fields as it passes through or munging some
headers.) If accessor operations are a bottleneck, a better approach
is to define a class that wraps an array and uses pack/unpack when the
object needs to behave like a binary string.”

TwP
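
The approach the readme recommends for accessor-heavy code could look roughly like the following. The class name Record, its methods, and the three-field layout are all hypothetical; this is not bit-struct's API, just pack/unpack around a plain array.

```ruby
# Hypothetical record: three 32-bit unsigned little-endian fields.
class Record
  FORMAT = 'V3'   # assumed layout, chosen only for this example

  def initialize(fields)
    @fields = fields            # plain Ruby array, cheap to read and write
  end

  # Build a record from its binary form.
  def self.from_binary(bytes)
    new(bytes.unpack(FORMAT))
  end

  def [](index)
    @fields[index]
  end

  def []=(index, value)
    @fields[index] = value
  end

  # Pack only when the binary string is actually needed.
  def to_binary
    @fields.pack(FORMAT)
  end
end

rec = Record.from_binary([640, 480, 24].pack('V3'))
rec[2] = 32                     # field access touches the array only
packed = rec.to_binary          # 12 bytes, repacked on demand
```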

Tim P.:

On 10/12/06, Rolando A. wrote:

int b;
read(f, &b, sizeof(int));
class IO
def geti( endian = :little )
str = self.read( 4 )
str = str.reverse if endian == :little
str.unpack( 'N' )[0]
end
end

Yet you cannot be sure that sizeof(int) is 4.

Kalman

On 10/12/06, Kalman N. wrote:

Yet you cannot be sure that sizeof(int) is 4.

no, but in my case it works just fine. it will always be a 4 byte
unsigned integer.

Kalman

regards,

Rolando A. wrote:

rolando – [[ knowledge is empty, fill it ]] –
“Tam pro papa quam pro rege bibunt omnes sine lege.”

Quicquid Venus imperat, labor est suavis…

;)
Hal

Robert K.:

Kalman N. wrote:

Tim P.:

str = self.read( 4 )
Yet you cannot be sure that sizeof(int) is 4.
str = read( [0].pack('N').length )

Hey, only now I learnt that sizeof(int) is 4 even on my amd64 machine. I
had to check that with a C program to make me believe it.

Kalman

On 12.10.2006 18:10, Kalman N. wrote:

end

Yet you cannot be sure that sizeof(int) is 4.

Kalman

class IO
def geti( endian = :little )
str = read( [0].pack('N').length )
str.reverse! if endian == :little
str.unpack( 'N' )[0]
end
end

Regards

robert

On 10/13/06, Robert K. wrote:

On 12.10.2006 18:10, Kalman N. wrote:

class IO
def geti( endian = :little )
str = read( [0].pack('N').length )
str.reverse! if endian == :little
str.unpack( 'N' )[0]
end
end

Ooooo … clever!

class IO
SIZEOF_INT = [0].pack('N').length

def geti( endian = :little )
str = read( SIZEOF_INT )
str.reverse! if endian == :little
str.unpack( 'N' )[0]
end
end

I’m too lazy to benchmark it today, but is reverse! faster than
reverse on strings?

Blessings,
TwP

On Sat, 2006-10-14 at 02:42 +0900, Tim P. wrote:

end
end

I thought that 'N' was always 32 bits in network byte order?
According to the docs, platform-independent sizes are used everywhere
except for the SsIiLl directives when suffixed with an underscore…?
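
A quick way to check this on a given machine is to pack a zero with each directive and look at the byte count. This is a sketch, and the values noted for the native directives are typical rather than guaranteed.

```ruby
n_size      = [0].pack('N').bytesize    # always 4: 32-bit, network byte order
long_size   = [0].pack('L').bytesize    # always 4 without the underscore
int_size    = [0].pack('I').bytesize    # native sizeof(int), usually 4
native_long = [0].pack('L_').bytesize   # native sizeof(long), e.g. 8 on 64-bit Linux

puts "N=#{n_size} L=#{long_size} I=#{int_size} L_=#{native_long}"
```

So [0].pack('N').length does not track the platform's sizeof(int); it is simply the constant 4.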

Tim P. wrote:

end
end

I’m too lazy to benchmark it today, but is reverse! faster than
reverse on strings?

Yes, most likely. No new object is created.

robert
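
For anyone who wants to measure it, a minimal benchmark sketch using the standard Benchmark module follows. The iteration count and the sample string are arbitrary, and on four-byte strings the difference is likely to be small and machine dependent.

```ruby
require 'benchmark'

s    = String.new('abcd')
copy = s.reverse      # returns a new String; s is still "abcd" here
same = s.reverse!     # reverses s in place and returns s itself

n = 200_000
Benchmark.bm(9) do |x|
  # dup in both cases so each iteration starts from a fresh, mutable string
  x.report('reverse')  { n.times { 'abcd'.dup.reverse } }
  x.report('reverse!') { n.times { 'abcd'.dup.reverse! } }
end
```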