Forum: Ruby openssl and SHA*

Philippe C. (Guest)
on 2009-03-04 21:07
Hi,

I'm quite new to Ruby, and I'm facing a problem I can't seem to solve
by myself.
I'm comparing OpenSSL SHA1 hash results from the Linux command line with
the ones from Ruby:
---
cmd line :
openssl dgst -sha1 my_file

ruby :
require 'digest/sha1'
puts Digest::SHA1.hexdigest(File.read("my_file"))
---
I increase the file size and run both again, and again.
The hashes are identical until the file size reaches 512 MB; from then on
they differ.
I tried several SHA versions (SHA256 through SHA512) and the problem is the
same.
With MD5, however, I have no problem.

Does anyone have an idea whether I'm doing something wrong here?
Thanks a lot!

ChoBolT
Brian C. (Guest)
on 2009-03-04 23:13
Philippe Chotard wrote:
> I'm comparing OpenSSL SHA1 hash results from the Linux command line with
> the ones from Ruby:
> ---
> cmd line :
> openssl dgst -sha1 my_file
>
> ruby :
> require 'digest/sha1'
> puts Digest::SHA1.hexdigest(File.read("my_file"))
> ---
> I increase the file size and run both again, and again.
> The hashes are identical until the file size reaches 512 MB; from then on
> they differ.

Strange. First, try doing it in two stages:

  require 'digest/sha1'

  str = File.read("my_file")
  puts str.size   # should match the file size in bytes (on Ruby 1.9+, use str.bytesize)
  puts Digest::SHA1.hexdigest(str)

This may give you a clue if File.read is misbehaving. However this is
unlikely if Digest::MD5 is fine.

But in any case, reading 512MB of data into RAM just to calculate SHA1
is very wasteful. I suggest you recode it:

  puts Digest::SHA1.file("my_file").hexdigest

or read the file in blocks:

  d = Digest::SHA1.new
  File.open("my_file") do |f|
    while chunk = f.read(65536)
      d << chunk
    end
  end
  puts d.hexdigest

If you *still* get a hash that disagrees with openssl, then perhaps the
command-line tool you are comparing against is at fault! Most Linux
systems have at least two:

  sha1sum <file>
  openssl sha1 <file>

so you can see if those agree or disagree, too.
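
The Ruby-side checks suggested above can be rolled into one script. This is only a minimal sketch: `sha1_three_ways` is a made-up helper name, not something the Digest library provides, and the whole-string variant still needs enough RAM to hold the file.

```ruby
require 'digest/sha1'

# Hash one file three ways and return the hex digests keyed by method.
# NOTE: sha1_three_ways is a hypothetical helper, not part of Ruby's Digest library.
def sha1_three_ways(path, chunk_size = 65536)
  results = {}

  # 1. Whole file read into a single string (the approach from the original post).
  whole = File.open(path, 'rb') { |f| f.read }
  results[:whole_string] = Digest::SHA1.hexdigest(whole)

  # 2. Digest::SHA1.file, which reads the file itself.
  results[:digest_file] = Digest::SHA1.file(path).hexdigest

  # 3. Manual block-by-block reads fed into one digest object.
  d = Digest::SHA1.new
  File.open(path, 'rb') do |f|
    while chunk = f.read(chunk_size)
      d << chunk
    end
  end
  results[:chunked] = d.hexdigest

  results
end

# Print the three digests when run as a script with a file name argument.
if __FILE__ == $0 && ARGV[0]
  sha1_three_ways(ARGV[0]).each { |name, hex| puts "#{name}: #{hex}" }
end
```

If only `whole_string` differs from the other two, the file reading and the block-wise hashing are fine and the one-shot call on a big string is the suspect.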

On my box (Ubuntu Hardy, ruby-1.8.6p114 compiled from source):

$ ls -l ubuntu-8.04-desktop-i386.iso
-rw-r--r-- 1 brian brian 733079552 Apr 24  2008 ubuntu-8.04-desktop-i386.iso
$ sha1sum ubuntu-8.04-desktop-i386.iso
53a07a006d791f7fddc6d53879e826934f73bc0f  ubuntu-8.04-desktop-i386.iso
$ openssl dgst -sha1 ubuntu-8.04-desktop-i386.iso
SHA1(ubuntu-8.04-desktop-i386.iso)= 53a07a006d791f7fddc6d53879e826934f73bc0f
$ irb
irb(main):001:0> require 'digest/sha1'
=> true
irb(main):002:0> Digest::SHA1.file("ubuntu-8.04-desktop-i386.iso").hexdigest
=> "53a07a006d791f7fddc6d53879e826934f73bc0f"
irb(main):003:0> d = Digest::SHA1.new
=> #<Digest::SHA1: da39a3ee5e6b4b0d3255bfef95601890afd80709>
irb(main):004:0> File.open("ubuntu-8.04-desktop-i386.iso") do |f|
irb(main):005:1* while chunk = f.read(65536)
irb(main):006:2> d << chunk
irb(main):007:2> end
irb(main):008:1> end
=> nil
irb(main):009:0> d.hexdigest
=> "53a07a006d791f7fddc6d53879e826934f73bc0f"
irb(main):010:0>

So I can't see any problem. However I don't really have enough RAM to
read the file all in at once without swapping badly. It's possible that
Digest::SHA1 barfs when given a string > 512MB.
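
That guess can be checked without a real file by building the string in memory and hashing it both ways. A minimal sketch follows; `one_shot_vs_chunked` is a hypothetical helper name, and the demo call uses a small size so it runs anywhere. To exercise the suspected case the size would have to exceed 512 MB (for example 600 * 1024 * 1024), which needs that much free RAM. As an aside, 512 MB is 2^29 bytes, which is exactly 2^32 bits, so a message-length counter wrapping at 32 bits would fit the symptom; that is speculation, and nothing in this thread confirms it.

```ruby
require 'digest/sha1'

# Hash an in-memory string of `size` bytes in one call and in blocks.
# NOTE: one_shot_vs_chunked is a hypothetical helper for illustration only.
def one_shot_vs_chunked(size, chunk_size = 1024 * 1024)
  data = 'a' * size

  # One call on the whole string, as in the original post.
  one_shot = Digest::SHA1.hexdigest(data)

  # Same bytes, fed to a digest object one slice at a time.
  d = Digest::SHA1.new
  offset = 0
  while offset < size
    d << data[offset, chunk_size]
    offset += chunk_size
  end

  [one_shot, d.hexdigest]
end

# Small demo size; raise it above 512 MB to test the suspected case.
one_shot, chunked = one_shot_vs_chunked(1_000_000)
puts "one-shot: #{one_shot}"
puts "chunked:  #{chunked}"
puts(one_shot == chunked ? "match" : "MISMATCH")
```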

Regards,

Brian.
Philippe C. (Guest)
on 2009-03-05 14:23
Brian C. wrote:
> But in any case, reading 512MB of data into RAM just to calculate SHA1
> is very wasteful. I suggest you recode it:
>
>   puts Digest::SHA1.file("my_file").hexdigest

Thanks for your response Brian.
Indeed, using this method I got the right hash. So it looks like, as you
said, the problem appears when SHA is handling strings of 512 MB or more.

I'll do some further testing on other systems and versions (I'm using Ruby
1.9.0 on Debian lenny).
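
For that kind of testing across systems, it may help to print the interpreter version and compare the whole-string result against a second implementation. The sketch below assumes Ruby's bundled openssl extension is installed and uses `OpenSSL::Digest`, fed in blocks, as the reference; `cross_check` and `ALGORITHMS` are hypothetical names, and the file is still read fully into memory once.

```ruby
require 'digest/sha1'
require 'digest/sha2'
require 'openssl'

# Digest classes paired with the algorithm names OpenSSL::Digest accepts.
# NOTE: ALGORITHMS and cross_check are hypothetical names for this sketch.
ALGORITHMS = {
  'SHA1'   => Digest::SHA1,
  'SHA256' => Digest::SHA256,
  'SHA512' => Digest::SHA512
}

# For each algorithm, hash the whole file as one string with Digest and
# in blocks with OpenSSL::Digest. Returns { name => [whole_hex, streamed_hex] }.
def cross_check(path, chunk_size = 65536)
  data = File.open(path, 'rb') { |f| f.read }
  results = {}
  ALGORITHMS.each do |name, klass|
    whole = klass.hexdigest(data)

    streamed = OpenSSL::Digest.new(name)
    File.open(path, 'rb') do |f|
      while chunk = f.read(chunk_size)
        streamed << chunk
      end
    end

    results[name] = [whole, streamed.hexdigest]
  end
  results
end

# Report per-algorithm agreement when run as a script with a file name.
if __FILE__ == $0 && ARGV[0]
  puts "ruby #{RUBY_VERSION}"
  cross_check(ARGV[0]).each do |name, (whole, streamed)|
    puts "#{name}: #{whole} #{whole == streamed ? 'ok' : 'MISMATCH'}"
  end
end
```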

Thanks !