It had to happen eventually... We received a pull request recently for a change that makes JRuby's hashing of Strings, Booleans, nil, and Symbols be consistent. Basically, it provides hardcoded hashes for Booleans and nil, and makes it possible to disable seeded hashes for String and Symbol. PR: https://github.com/jruby/jruby/pull/590 My question for ruby-core: at what point did you decide to make hash for e.g. nil not be a single value (it was "4" in 1.8.7 and different/random in 1.9.3+), and why did you do it? I think it's valid to want to be able to consistently hash these values across runtimes, but I want to understand the implications before I merge this patch into JRuby proper. Thoughts? - Charlie
on 2013-03-21 01:03
on 2013-03-21 01:51
hi, On Thu, Mar 21, 2013 at 9:03 AM, Charles Oliver Nutter <headius@headius.com> wrote: > My question for ruby-core: at what point did you decide to make hash > for e.g. nil not be a single value (it was "4" in 1.8.7 and > different/random in 1.9.3+), and why did you do it? Wait, `nil.id` returns NoMethodError, `nil.__id__` returns '4' (1.9.3) or '8' (2.0.0+), not a random value. `Object#__id__` shows a value of `VALUE` type in internal of its object, And `4` and `8` means `nil`, and it's a fixed value (in internal). VALUE is a type like a pointer in C, and it shows a object in Ruby level. and some types are not pointer, shows object directly. (for examples, Fixnum and Flonum, true, false, and nil.)
on 2013-03-21 02:22
Hello Shota, On 2013/03/21 9:50, Shota Fukumori (sora_h) wrote: > On Thu, Mar 21, 2013 at 9:03 AM, Charles Oliver Nutter > <headius@headius.com> wrote: >> My question for ruby-core: at what point did you decide to make hash >> for e.g. nil not be a single value (it was "4" in 1.8.7 and >> different/random in 1.9.3+), and why did you do it? > > Wait, `nil.id` returns NoMethodError, `nil.__id__` returns '4' (1.9.3) > or '8' (2.0.0+), > not a random value. The question is about nil.hash, not about nil.__id__. Regards, Martin.
on 2013-03-21 02:32
On 21 March 2013 10:50, Shota Fukumori (sora_h) <sorah@tubusu.net> wrote: > not a random value. > In 1.8, nil.hash => 4 In 1.9, nil.hash => random Fixnum, e.g. 192286870168610551 Ditto 2.0
on 2013-03-21 03:03
hash romdomization is introduced at r17465 by akr. akr explained it is for algorithmic complexity attack in [ruby-dev:37778]. http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach... 2013/3/21 "Martin J. Drst" <duerst@it.aoyama.ac.jp>
on 2013-03-21 03:55
Hello Charlie, On 2013/03/21 9:03, Charles Oliver Nutter wrote: > for e.g. nil not be a single value (it was "4" in 1.8.7 and > different/random in 1.9.3+), and why did you do it? Yui already gave a pointer. Actually, neither NilClass nor TrueClass nor FalseClass implement #hash. All three fall back to Object. So do Symbol and Fixnum. So it seems to be mainly the result of not doing anything more than necessary in terms of implementation against the potential DOS attacks. > I think it's valid to want to be able to consistently hash these > values across runtimes, but I want to understand the implications > before I merge this patch into JRuby proper. > > Thoughts? I can't currently see a security problem with making the hash values of nil, true, and false consistent across runtimes. But then that's not a guarantee that there are none (I'm not a security expert). And I don't see a reason for making only these three stable. When it comes to Symbols, we get back to the question to what extent Symbols may/can/shouldn't be created based on data coming in from the outside of an application, which we discussed related to garbage collection of symbols. Overall, having a switch to eliminate introducing randomness into hash values may be something to consider. But it will produce problems when an application is put together from various libraries: Some of these libraries may depend on hashes being stable, while some others may be open to DOS attacks when hashes are stable. So maybe those who need stable hashes should implement #stable_hash methods, and if it turns out that this is used often, we can add it to Ruby itself. Regards, Martin.
on 2013-03-21 18:52
2013/3/21 "Martin J. Drst" <duerst@it.aoyama.ac.jp> >> makes it possible to disable seeded hashes for String and Symbol. > Fixnum. So it seems to be mainly the result of not doing anything more than > I can't currently see a security problem with making the hash values of > values may be something to consider. But it will produce problems when an > application is put together from various libraries: Some of these libraries > may depend on hashes being stable, while some others may be open to DOS > attacks when hashes are stable. > > I agree that NilClass, TrueClass and FalseClass are probably not an issue since they are singletons. The core problem is a user being able to predictably create different objects that hash to the same value. Symbol, Fixnum and Bignum fall back to Object/Kernel#hash, and Kernel#hash uses MurmurHash [1] & [2]. While randomized, Jean-Philippe Aumasson and Daniel Bernstein showed last year that a randomized MurmurHash isn't enough in general to prevent this kind of denial of service attacks. The proof of concept can be found at [3] and in response, SipHash was introduced, replacing MurmurHash in rb_mem_hash [4]. [1] https://github.com/ruby/ruby/blob/7e052c7b81d22fe4... [2] https://github.com/ruby/ruby/blob/7e052c7b81d22fe4... [3] https://github.com/emboss/schadcode/tree/master/ruby/cruby [4] https://github.com/ruby/ruby/blob/f7108c5d1a10cacf... Off the top of my head I would argue that the same technique cannot be applied to Kernel#hash. The argument to be hashed is bounded and just a few bytes long - for the attack to work we would need potentially unbounded byte arrays. Of course, the safest thing would probably be switching Kernel#hash to SipHash, too. I'm not sure to what extent this would affect performance. Maybe we could follow the JRuby approach - MurmurHash was replaced with Perl's hash, with the option to switch to SipHash on demand (@headius - this is correct?). Jean-Philippe confirmed that Perl's hash has not been broken (yet), but it's not unlikely that this could happen in the future. Since we already have SipHash in CRuby, it would be relatively easy to do the same thing, I guess. Another upshot would be that Perl's hash is faster than MurmurHash, too. What do you think? So maybe those who need stable hashes should implement #stable_hash > methods, and if it turns out that this is used often, we can add it to Ruby > itself. > > I like the idea. It is stated in the docs [5] that nobody should assume the values to be consistent across VM boundaries, doing otherwise would clearly violate this contract. -Martin [5] http://www.ruby-doc.org/core-2.0/Object.html#method-i-hash
Please log in before posting. Registration is free and takes only a minute.
Existing account
(Switch to SSL-encrypted connection)
NEW: Do you have a Google/GoogleMail or Yahoo account? No registration required!
Log in with Google account | Log in with Yahoo account
Log in with Google account | Log in with Yahoo account
No account? Register here.