Bit Manipulation for a Pure Ruby SHA1 implementation

leveille · October 12, 2010, 6:35am

I’m trying to write a pure Ruby implementation of SHA1. In order to do
so, I need to be able to directly manipulate bits. I know about the
bitwise operators in Ruby, but I’m having issues with the way that
Ruby wraps things up in classes. Basically I’ve run into two problems.

PROBLEM 1: I need to pad the message by appending a 1-bit, and then a
certain number of 0s. When appending numbers (e.g. 0b1) to the end of
by string, Ruby wraps each number in its Fixnum class making it 1 byte
long: So basically adding 0b1 appends 0b0000_0001, which is 7 zeros
that shouldn’t be there. The fix I’ve found is I to tack 0x80 on the
end, which concats 0b1000_0000. The downside is that I have to keep
track of my zeros in batches of bytes (which at this point my
implementation handles alright), but I’d like to know how to do it bit
by bit if possible.

I’ve seen the String.unpack and Array.pack used for binary data.
However, I can only get the bits out in one glob str.unpack(“B*”), or
str.unpack(“B8”*string.length), which still just gives me a bunch of
stringified 8bit binary values. Seems like there should be a better
way.

PROBLEM 2: After adding a one-bit and k zeros, I need to append a 64-
bit long integer representing the length of the message being hashed.
It seem like Fixnum is 8 bytes long on my machine, so the question is
how do I add a 64bit binary representation of this number to the end
of a string?

Any help would be greatly appreciated. Thanks in advance.

leveille · October 12, 2010, 7:57am

On Mon, Oct 11, 2010 at 11:35 PM, Jason L.
[email protected]wrote:

end, which concats 0b1000_0000. The downside is that I have to keep
PROBLEM 2: After adding a one-bit and k zeros, I need to append a 64-
bit long integer representing the length of the message being hashed.
It seem like Fixnum is 8 bytes long on my machine, so the question is
how do I add a 64bit binary representation of this number to the end
of a string?

Any help would be greatly appreciated. Thanks in advance.

Hi, if you just need a portable SHA1 implementation, there is already
one in
the standard library.
http://ruby-doc.org/stdlib/libdoc/digest/rdoc/index.html

As for your problems, I’m finding myself really confused by your
terminology. For example, when you say “adding a one-bit and k zeros”
does
that mean num + 2**k, Or is num <<= 1 ; num += 1 ; num <<= k or
something
else even?

It might be more helpful if you show “this is what I have, this is what
I
want, this is how I tried to get there”

leveille · October 12, 2010, 10:26am

On Tue, Oct 12, 2010 at 6:35 AM, Jason L. [email protected]
wrote:

end, which concats 0b1000_0000. The downside is that I have to keep
PROBLEM 2: After adding a one-bit and k zeros, I need to append a 64-
bit long integer representing the length of the message being hashed.
It seem like Fixnum is 8 bytes long on my machine, so the question is
how do I add a 64bit binary representation of this number to the end
of a string?

Any help would be greatly appreciated. Thanks in advance.

I would create a class which is a thin wrapper around a String
instance and probably a length in bits. Then add methods that perform
bit manipulations that you need and use that in your implementation.

untested

class BitStream
NULL = “\000”.force_encoding “BINARY”

def initialize(s = “”, l = 0)
@s = s
@l = l
@s.force_encoding “BINARY”
end

def bit_length; @l; end

def add_bit(b)
b = case b
when false, 0, nil
0
else
1
end

self[@l] = b

end

def
raise “Index error” if index < 0 || index >= @l
byte, off = index.divmod 8
@s[byte].ord & (1 << off) == 0 ? 0 : 1
end

def []=(index, bit)
raise “Index error” if index < 0
byte, off = index.divmod 8

while @s.bytesize < byte
  @s << NULL
end

x = @s[byte].ord

@s[byte] = (if bit == 1
  x | (1 << off)
else
  x & (0xFF ^ (1 << off))
end).chr

end
end

Kind regards

robert

leveille · October 12, 2010, 10:05pm

On Oct 12, 1:38pm, Jarsen [email protected] wrote:

I’m trying to write a pure Ruby implementation of SHA1. In order to do
track of my zeros in batches of bytes (which at this point my
bit long integer representing the length of the message being hashed.
terminology. For example, when you say “adding a one-bit and k zeros” does
This is what I have so far for padding:

def SHA1 message
k = 448-((message.bitsize % block_bitsize) + 1)

  # EDIT
  # I think I need this here actually for certain cases
  k+=512 if k < 0

leveille · October 12, 2010, 9:40pm

On Oct 11, 11:56pm, Josh C. [email protected] wrote:

bitwise operators in Ruby, but I’m having issues with the way that
by bit if possible.
how do I add a 64bit binary representation of this number to the end
else even?

It might be more helpful if you show “this is what I have, this is what I
want, this is how I tried to get there”

Sorry, looking back on that I even confused myself. I am aware of the
SHA1 implementation in the standard library, I’m just trying to write
one myself using pure Ruby.

This is what I have so far for padding:

def SHA1 message
k = 448-((message.bitsize % block_bitsize) + 1)
m = message.unpack(“B*”)[0].to_i(2) << (k+65)
padding = 1
padding <<= k+64
padding |= message.bitsize # then add last 64 bit block (message
length) which is the len
m |= padding # add the padding to the message
#… split into 512-bit blocks
#… do SHA1 operations
end

Where bitsize is defined as:
class String; def bitsize; bytesize * 8; end; end

The method so far seems to be giving me the correct value for m.
However, my problem is with the leading zeros. For example, if the
message is "abc, message.unpack(“B*”)[0] returns
“011000010110001001100011”. When I call to_i(2) on that number, the
leading 0 bit will disappear - which makes sense because to 01 an 1
are both the same number.

However, when I begin to perform sha1 operations on those blocks I
will need that leading 0 back; in the example where message = “abc”,
m.to_s(2) or will only give me a 511 digit long output, when I need a
512 bit block. So I need to figure out how to keep those leading zeros
in place so I have the correct block size. Hopefully this makes more
sense.

leveille · October 14, 2010, 11:03am

Jason L. wrote in post #949355:

PROBLEM 1: I need to pad the message by appending a 1-bit, and then a
certain number of 0s. When appending numbers (e.g. 0b1) to the end of
by string, Ruby wraps each number in its Fixnum class making it 1 byte
long: So basically adding 0b1 appends 0b0000_0001, which is 7 zeros
that shouldn’t be there. The fix I’ve found is I to tack 0x80 on the
end, which concats 0b1000_0000. The downside is that I have to keep
track of my zeros in batches of bytes (which at this point my
implementation handles alright), but I’d like to know how to do it bit
by bit if possible.

A String in ruby is a collection of bytes, not bits.

I think you’ll find that as long as your input to your sha1 algorithm is
also a String, and thus a multiple of 8 bits, everything will work fine.

The way the sha1 algorithm is defined it can be implemented for an input
which is not a multiple of 8 bits, but as you’ve found, it’s not easy to
represent that anyway Therefore, if you want your implementation to
work with strange numbers of bits, you’ll first need to choose an input
representation.

If I were you, I’d be happy to limit my implementation to inputs which
are whole bytes. (After all, even a file on the filesystem is a whole
number of bytes)

PROBLEM 2: After adding a one-bit and k zeros, I need to append a 64-
bit long integer representing the length of the message being hashed.
It seem like Fixnum is 8 bytes long on my machine, so the question is
how do I add a 64bit binary representation of this number to the end
of a string?

Don’t worry about Fixnum versus Bignum, that’s internal to the
implementation and mostly hidden. Even on a 32-bit machine you can make
Integers which are 64-bits and above.

Unfortunately, Array#pack(“Q”) works differently on little-endian and
big-endian machines:

irb(main):001:0> [255].pack(“Q”)
=> “\377\000\000\000\000\000\000\000”

So for maximum portability I’d do two instances of “N”, i.e.

n = 0x0123456789abcdef
=> 81985529216486895

[n >> 32, n].pack(“NN”)
=> “\001#Eg\211\253\315\357”

which you can check is correct like this:

[n >> 32, n].pack(“NN”).unpack(“H*”)
=> [“0123456789abcdef”]

Regards,

Brian.