Forum: Ruby-core Consistent hashing in the face of HashDOS?

Posted by Charles Nutter (headius)
on 2013-03-21 01:03
(Received via mailing list)
It had to happen eventually...

We received a pull request recently for a change that makes JRuby's
hashing of Strings, Booleans, nil, and Symbols be consistent.
Basically, it provides hardcoded hashes for Booleans and nil, and
makes it possible to disable seeded hashes for String and Symbol.

PR: https://github.com/jruby/jruby/pull/590

My question for ruby-core: at what point did you decide to make hash
for e.g. nil not be a single value (it was "4" in 1.8.7 and
different/random in 1.9.3+), and why did you do it?

I think it's valid to want to be able to consistently hash these
values across runtimes, but I want to understand the implications
before I merge this patch into JRuby proper.

Thoughts?

- Charlie
Posted by Shota Fukumori (sora_h) (Guest)
on 2013-03-21 01:51
(Received via mailing list)
hi,

On Thu, Mar 21, 2013 at 9:03 AM, Charles Oliver Nutter
<headius@headius.com> wrote:
> My question for ruby-core: at what point did you decide to make hash
> for e.g. nil not be a single value (it was "4" in 1.8.7 and
> different/random in 1.9.3+), and why did you do it?

Wait, `nil.id` returns NoMethodError, `nil.__id__` returns '4' (1.9.3)
or '8' (2.0.0+),
not a random value.

`Object#__id__` shows a value of `VALUE` type in internal of its object,
And `4` and `8` means `nil`, and it's a fixed value (in internal).

VALUE is a type like a pointer in C, and it shows a object in Ruby 
level.
and some types are not pointer, shows object directly.
(for examples, Fixnum and Flonum, true, false, and nil.)
Posted by "Martin J. Dürst" <duerst@it.aoyama.ac.jp> (Guest)
on 2013-03-21 02:22
(Received via mailing list)
Hello Shota,

On 2013/03/21 9:50, Shota Fukumori (sora_h) wrote:
> On Thu, Mar 21, 2013 at 9:03 AM, Charles Oliver Nutter
> <headius@headius.com>  wrote:
>> My question for ruby-core: at what point did you decide to make hash
>> for e.g. nil not be a single value (it was "4" in 1.8.7 and
>> different/random in 1.9.3+), and why did you do it?
>
> Wait, `nil.id` returns NoMethodError, `nil.__id__` returns '4' (1.9.3)
> or '8' (2.0.0+),
> not a random value.

The question is about nil.hash, not about nil.__id__.

Regards,   Martin.
Posted by Matthew Kerwin (mattyk)
on 2013-03-21 02:32
(Received via mailing list)
On 21 March 2013 10:50, Shota Fukumori (sora_h) <sorah@tubusu.net> 
wrote:

> not a random value.
>

In 1.8, nil.hash => 4
In 1.9, nil.hash => random Fixnum, e.g. 192286870168610551
Ditto 2.0
Posted by NARUSE, Yui (Guest)
on 2013-03-21 03:03
(Received via mailing list)
hash romdomization is introduced at r17465 by akr.
akr explained it is for algorithmic complexity attack in 
[ruby-dev:37778].

http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach...



2013/3/21 "Martin J. Drst" <duerst@it.aoyama.ac.jp>
Posted by "Martin J. Dürst" <duerst@it.aoyama.ac.jp> (Guest)
on 2013-03-21 03:55
(Received via mailing list)
Hello Charlie,

On 2013/03/21 9:03, Charles Oliver Nutter wrote:
> for e.g. nil not be a single value (it was "4" in 1.8.7 and
> different/random in 1.9.3+), and why did you do it?

Yui already gave a pointer. Actually, neither NilClass nor TrueClass nor
FalseClass implement #hash. All three fall back to Object. So do Symbol
and Fixnum. So it seems to be mainly the result of not doing anything
more than necessary in terms of implementation against the potential DOS
attacks.


> I think it's valid to want to be able to consistently hash these
> values across runtimes, but I want to understand the implications
> before I merge this patch into JRuby proper.
>
> Thoughts?

I can't currently see a security problem with making the hash values of
nil, true, and false consistent across runtimes. But then that's not a
guarantee that there are none (I'm not a security expert). And I don't
see a reason for making only these three stable.

When it comes to Symbols, we get back to the question to what extent
Symbols may/can/shouldn't be created based on data coming in from the
outside of an application, which we discussed related to garbage
collection of symbols.

Overall, having a switch to eliminate introducing randomness into hash
values may be something to consider. But it will produce problems when
an application is put together from various libraries: Some of these
libraries may depend on hashes being stable, while some others may be
open to DOS attacks when hashes are stable.

So maybe those who need stable hashes should implement #stable_hash
methods, and if it turns out that this is used often, we can add it to
Ruby itself.

Regards,   Martin.
Posted by "Martin Boßlet" <martin.bosslet@gmail.com> (Guest)
on 2013-03-21 18:52
(Received via mailing list)
2013/3/21 "Martin J. Drst" <duerst@it.aoyama.ac.jp>

>> makes it possible to disable seeded hashes for String and Symbol.
> Fixnum. So it seems to be mainly the result of not doing anything more than
> I can't currently see a security problem with making the hash values of
> values may be something to consider. But it will produce problems when an
> application is put together from various libraries: Some of these libraries
> may depend on hashes being stable, while some others may be open to DOS
> attacks when hashes are stable.
>
>

I agree that NilClass, TrueClass and FalseClass are probably not an 
issue
since they are singletons. The core problem is a user being able to
predictably create different objects that hash to the same value. 
Symbol,
Fixnum and Bignum fall back to Object/Kernel#hash, and Kernel#hash uses
MurmurHash [1] & [2].

While randomized, Jean-Philippe Aumasson and Daniel Bernstein showed 
last
year that a randomized MurmurHash isn't enough in general to prevent 
this
kind of denial of service attacks. The proof of concept can be found at 
[3]
and in response, SipHash was introduced, replacing MurmurHash in
rb_mem_hash [4].

[1]
https://github.com/ruby/ruby/blob/7e052c7b81d22fe4...
[2]
https://github.com/ruby/ruby/blob/7e052c7b81d22fe4...
[3] https://github.com/emboss/schadcode/tree/master/ruby/cruby
[4]
https://github.com/ruby/ruby/blob/f7108c5d1a10cacf...

Off the top of my head I would argue that the same technique cannot be
applied to Kernel#hash. The argument to be hashed is bounded and just a 
few
bytes long - for the attack to work we would need potentially unbounded
byte arrays.

Of course, the safest thing would probably be switching Kernel#hash to
SipHash, too. I'm not sure to what extent this would affect performance.
Maybe we could follow the JRuby approach - MurmurHash was replaced with
Perl's hash, with the option to switch to SipHash on demand (@headius -
this is correct?). Jean-Philippe confirmed that Perl's hash has not been
broken (yet), but it's not unlikely that this could happen in the 
future.
Since we already have SipHash in CRuby, it would be relatively easy to 
do
the same thing, I guess. Another upshot would be that Perl's hash is 
faster
than MurmurHash, too. What do you think?

So maybe those who need stable hashes should implement #stable_hash
> methods, and if it turns out that this is used often, we can add it to Ruby
> itself.
>
>
I like the idea. It is stated in the docs [5] that nobody should assume 
the
values to be consistent across VM boundaries, doing otherwise would 
clearly
violate this contract.

-Martin

[5] http://www.ruby-doc.org/core-2.0/Object.html#method-i-hash
Please log in before posting. Registration is free and takes only a minute.
Existing account (Switch to SSL-encrypted connection)
NEW: Do you have a Google/GoogleMail or Yahoo account? No registration required!
Log in with Google account | Log in with Yahoo account
No account? Register here.