Efficient processing of binary data streams in Ruby?

I’m writing a Ruby program that has to process binary data from files
and sockets. Data items are in bytes, 16-bit words, or 32-bit words,
and I cannot predict in advance whether the data will be msb-first or
lsb-first, so I end up writing things like this:

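def unpack_16(x)
    # x is a 2-byte String; getbyte returns each byte as an Integer
    a, b = x.getbyte(0), x.getbyte(1)
    @msb_first ? ((a<<8)|b) : ((b<<8)|a)
end

def pack_16(x)
    y = "xx"
    if (@msb_first)
        y.setbyte(0, x>>8)
        y.setbyte(1, x&255)
    else
        y.setbyte(0, x&255)
        y.setbyte(1, x>>8)
    end
    y   # return the packed string, not the value of the if
end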

I expect, however, that this will be painfully slow, and I can’t
imagine that this hasn’t been thought of before. Is there a better way
to do this that will result in much better performance?

Thanks!

On 3/8/07, [email protected] wrote:

> I can’t imagine that this hasn’t been thought of before. Is there a
> better way to do this that will result in much better performance?

def unpack_16( str )
  # 'n' is big-endian; 'S' uses the host's native byte order
  (@msb_first ? str.unpack('n') : str.unpack('S')).first
end

def pack_16( num )
  @msb_first ? [num].pack('n') : [num].pack('S')
end

That will work on little-endian processors (Intel), where the native
byte order used by 'S' is lsb-first, but not on big-endian processors
(PowerPC, Sparc). For these methods to work on the latter you’ll have
to do something like this …

def unpack_16( str )
  str = str.reverse unless @msb_first
  str.unpack('n').first
end

def pack_16( num )
  str = [num].pack('n')
  # return the string in both cases; a trailing "unless" would yield nil
  @msb_first ? str : str.reverse
end
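
A further option, sketched here under the assumption that the standard pack directives are acceptable: 'n' and 'N' are always big-endian (16 and 32 bits), and 'v' and 'V' are always little-endian, whatever the host CPU is. With those, no string reversal or host check is needed. The class name WordCodec is made up for this illustration and does not come from the thread.

```ruby
# Hypothetical helper: WordCodec is an illustrative name, not an existing library.
class WordCodec
  # 'n'/'N' = big-endian 16/32-bit, 'v'/'V' = little-endian 16/32-bit,
  # independent of the byte order of the machine running the code.
  def initialize(msb_first)
    @fmt16 = msb_first ? 'n' : 'v'
    @fmt32 = msb_first ? 'N' : 'V'
  end

  def unpack_16(str)
    str.unpack(@fmt16).first
  end

  def pack_16(num)
    [num].pack(@fmt16)
  end

  def unpack_32(str)
    str.unpack(@fmt32).first
  end

  def pack_32(num)
    [num].pack(@fmt32)
  end
end

be = WordCodec.new(true)
le = WordCodec.new(false)
p be.pack_16(0x1234).bytes           # => [18, 52]
p le.pack_16(0x1234).bytes           # => [52, 18]
p le.unpack_16(be.pack_16(0x1234))   # => 13330 (0x3412)
```

Picking the format string once in the constructor keeps the per-call work down to a single pack or unpack.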

Just define the desired method based on the processor type – which
can be figured out by doing this …

# the first byte of a native int is 42 only on a little-endian host
LITTLE_ENDIAN = [42].pack('I').unpack('C').first == 42

if LITTLE_ENDIAN
  # define little endian methods here
else
  # define big endian methods here
end
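
As a rough sketch of what those two branches might contain, assuming the native 'S' directive is used for the actual conversion: the module name Words16 and the constant LITTLE_ENDIAN_HOST are invented for this example, and the input is assumed to be exactly two bytes long.

```ruby
# Sketch only: Words16 and LITTLE_ENDIAN_HOST are hypothetical names.
LITTLE_ENDIAN_HOST = [42].pack('I').unpack('C').first == 42

module Words16
  if LITTLE_ENDIAN_HOST
    # Native 'S' reads lsb-first here, so msb-first data is reversed first.
    def self.unpack(str, msb_first)
      (msb_first ? str.reverse : str).unpack('S').first
    end
  else
    # Native 'S' reads msb-first here, so lsb-first data is reversed first.
    def self.unpack(str, msb_first)
      (msb_first ? str : str.reverse).unpack('S').first
    end
  end
end

p Words16.unpack("\x12\x34", true)    # => 4660  (0x1234)
p Words16.unpack("\x12\x34", false)   # => 13330 (0x3412)
```

The host check runs once when the file is loaded, so each call only pays for the msb_first test.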

Hope that helps

Blessings,
TwP

On Fri, 9 Mar 2007, [email protected] wrote:

> I can’t imagine that this hasn’t been thought of before. Is there a
> better way to do this that will result in much better performance?

this will be extremely fast for even huge buffers of data

harp:~ > ruby a.rb
huge(100000) LSB(8) in 0.00117683410644531s
huge(100000) LSB(16) in 0.00181722640991211s
huge(100000) LSB(32) in 0.00884389877319336s
huge(100000) MSB(8) in 0.00245118141174316s
huge(100000) MSB(16) in 0.0045168399810791s
huge(100000) MSB(32) in 0.0078279972076416s

harp:~ > cat a.rb
require 'rubygems'
require 'narray'

module Intification
  LSB = :LSB
  MSB = :MSB
  # the host is LSB-first if the first byte of a native int holds the low bits
  HOST = [42].pack('i').unpack('c').first == 42 ? LSB : MSB

  def ints bits = 8, order = LSB
    words = bits / 8

    type =
      case bits.to_i
        when 8
          NArray::BYTE
        when 16
          NArray::SINT
        when 32
          NArray::INT
        else
          raise ArgumentError, bits.inspect
      end

    # view the string as an array of ints, swapping bytes only when the
    # data order differs from the host order
    na = NArray.to_na to_s, type, size/words
    order == HOST ? na : na.swap_byte
  end
end

class String
  include Intification
end

def bm label
  a = Time.now
  yield
  b = Time.now
  puts "#{ label } in #{ b.to_f - a.to_f }s"
end

n = 100_000

huge = { :LSB => {}, :MSB => {} }

# 'v'/'V' are little-endian on every host (native 's'/'i' would vary)
huge[:LSB][8] = [39,40,41,42].pack('c*') * n
huge[:LSB][16] = [39,40,41,42].pack('v*') * n
huge[:LSB][32] = [39,40,41,42].pack('V*') * n

huge[:MSB][8] = [39,40,41,42].pack('c*') * n
huge[:MSB][16] = [39,40,41,42].pack('n*') * n
huge[:MSB][32] = [39,40,41,42].pack('N*') * n

[:LSB, :MSB].each do |order|
  [8,16,32].each do |bits|
    bm "huge(#{ n }) #{ order.to_s }(#{ bits })" do
      string = huge[order][bits]
      ints = string.ints(bits, order)
      last = ints[-4..-1]
      # compare with ==; a single = would assign and always pass
      raise unless last[0] == 39
      raise unless last[1] == 40
      raise unless last[2] == 41
      raise unless last[3] == 42
    end
  end
end
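
If installing narray is not an option, a plain-Ruby variant of the same idea might look like the sketch below. It assumes the standard String#unpack directives with a trailing '*' to decode a whole buffer in one call; the names FORMATS and decode_words are invented for this example. Unlike the narray version it returns unsigned values in an ordinary Array, and no timing is claimed for it.

```ruby
# Hypothetical helper: FORMATS and decode_words are illustrative names.
FORMATS = {
  [8,  :LSB] => 'C*', [8,  :MSB] => 'C*',   # single bytes have no byte order
  [16, :LSB] => 'v*', [16, :MSB] => 'n*',
  [32, :LSB] => 'V*', [32, :MSB] => 'N*'
}

# Decode an entire binary string into an Array of unsigned Integers.
def decode_words(data, bits, order)
  fmt = FORMATS.fetch([bits, order]) { raise ArgumentError, bits.inspect }
  data.unpack(fmt)
end

buf = [39, 40, 41, 42].pack('n*') * 3
p decode_words(buf, 16, :MSB).last(4)   # => [39, 40, 41, 42]
```

Decoding the buffer with one unpack call avoids a Ruby-level method call per word, which is the main cost in the per-item approach from the original question.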

regards.

if you’re on windows i have an narray install

-a
