YAML serialization of Hash with Set key not loadable

YAML serialization seems to be failing for me with any structure that
involves a hash with a set key. Is there documentation anywhere on what
sorts of objects are able to be serialized/deserialized via YAML? Or is
this a bug that’s fixed in a newer Ruby version (I’m using
1.9.1p378)? I would have expected this to work…

Example failing case
YAML.load(YAML.dump({[1].to_set=>1}))

Set itself seems to work fine
YAML.load(YAML.dump([1,2,3].to_set))

ruby -v
ruby 1.9.1p378 (2010-01-10 revision 26273) [x86_64-linux]
irb

irb(main):001:0> require 'yaml'
=> true
irb(main):002:0> require 'set'
=> true
irb(main):003:0> YAML.load(YAML.dump({[1].to_set=>1}))
ArgumentError: syntax error on line 2, col -1: ` hash:
1: true
: 1


from /usr/lib/ruby/1.9.1/yaml.rb:133:in `load'
from /usr/lib/ruby/1.9.1/yaml.rb:133:in `load'
from (irb):11
from /usr/bin/irb:12:in `’

On Wed, Oct 20, 2010 at 03:27:16AM +0900, Tim G. wrote:

YAML serialization seems to be failing for me with any structure that
involves a hash with a set key. Is there documentation anywhere on what
sorts of objects are able to be serialized/deserialized via YAML? Or if
this is a bug that’s fixed in a newer ruby version (I’m using
1.9.1p378)? I would have expected this to work…

This is a bug in Syck (the yaml parser / emitter in Ruby < 1.9.2).

The bug is fixed in Psych. Psych ships with Ruby 1.9.2:

irb(main):001:0> RUBY_VERSION
=> "1.9.2"
irb(main):002:0> require 'psych'
=> true
irb(main):003:0> require 'set'
=> true
irb(main):004:0> Psych.load(Psych.dump({[1].to_set=>1}))
=> {#<Set: {1}>=>1}
irb(main):005:0> Psych.load(Psych.dump([1,2,3].to_set)
irb(main):006:1> )
=> #<Set: {1, 2, 3}>
irb(main):007:0>

On Oct 19, 1:27pm, Tim G. wrote:

YAML serialization seems to be failing for me with any structure that
involves a hash with a set key. Is there documentation anywhere on what
sorts of objects are able to be serialized/deserialized via YAML? Or if
this is a bug that’s fixed in a newer ruby version (I’m using
1.9.1p378)?

I’m seeing the same behavior in 1.8.6p399 and 1.9.2rc2 so it’s
probably a bug that’s been around for a while.

I don’t know the reason for the issue since it seems to be somewhere
in the Syck library. In any Ruby I’ve tested:

ruby-1.9.2-rc2 > puts YAML.dump({[1].to_set=>1})

!ruby/object:Set ?
hash:
1: true
: 1

=> nil

while the correct YAML is:


?
!ruby/object:Set
hash:
1: true
: 1

I guess that means it’s a Syck bug, but this is all from a quick look
so I’m not sure.

Jeremy

Can anyone explain a bit more about using a Set as a hash key?

I wasn’t aware of the Set class.

Vadivelan

Thanks all for the help. Good to know that the issue is fixed in the
Psych library that ships with Ruby 1.9.2. I’ll try switching our Rails
app to that when we upgrade to 1.9.2.

Set seems to get no love sometimes. I remember that back in Ruby 1.8.6,
which we were stuck on for a long time, it didn’t even hash
consistently:
[1,2,3].to_set == [3,2,1].to_set => true
[1,2,3].to_set.hash == [3,2,1].to_set.hash => false
That at least was fixed in Ruby 1.8.7/1.9.
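
For anyone who wants to check their own Ruby, here is a minimal sketch of what "hashing consistently" means for a hash key. It assumes Ruby 1.8.7 or later; the variable names are only for illustration.

```ruby
require 'set'

a = [1, 2, 3].to_set
b = [3, 2, 1].to_set

# Hash lookup uses #hash to pick a bucket and #eql? to confirm the match,
# so equal sets must agree on both for lookups to work.
equal_sets = (a == b)
same_hash  = (a.hash == b.hash)
eql_sets   = a.eql?(b)

# Store under one set, look up with the other.
lookup = { a => :found }[b]

puts "==: #{equal_sets}, hash: #{same_hash}, eql?: #{eql_sets}, lookup: #{lookup.inspect}"
```

If any of the first three values is false, a Set stored as a key cannot be found again through an equal Set built in a different order.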

My workaround back then was to use sorted uniqued arrays as the key,
which is what I’ll continue to do for the yaml bug too.
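
A rough sketch of that workaround, in case it helps someone else. The helper name canonical_key is made up for this example, and it assumes the elements can be sorted against each other (e.g. all integers or all strings).

```ruby
require 'set'
require 'yaml'

# Hypothetical helper: turn any collection into an order-independent,
# YAML-friendly key (a sorted array without duplicates).
def canonical_key(items)
  items.to_a.uniq.sort
end

cache = {}
cache[canonical_key([3, 1, 2])] = 6

# A set with the same members maps to the same key, so nothing is overwritten.
cache[canonical_key([1, 2, 3].to_set)] ||= :recomputed

# Arrays of plain values round-trip through YAML as hash keys.
restored = YAML.load(YAML.dump(cache))
puts restored.inspect
```

The cost is that every lookup has to go through the same normalization, which a Set key would otherwise give you for free.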


As to why you might use the Set class as a hash key: in Ruby any object
can be used as a hash key (assuming it implements hash and eql?
consistently), and a set is a good way to represent a collection of
unique things where the order doesn’t matter.

For a made up example I want to cache some computed value in a hash that
depends on a set of parameters with no order dependency. Say the sum of
3 unique values.

require 'set'
require 'pp'
def foo(a,b,c)
  (@cached ||= {})[[a,b,c].to_set] ||= (sleep 10 ; [a,b,c].inject(&:+))
end

foo(1,2,3)
=>6 (slow, 10 seconds)

foo(1,2,3)
=>6 (fast cached)

foo(3,2,1)
=>6 (fast cached)

pp @cached
{#<Set: {1, 2, 3}>=>6}

If you used an array as the cache key instead of a set, then the order
would matter: foo(3,2,1) would have been slow and @cached would be
larger. My actual use case is something vaguely like this (a very
expensive computation on a list of unique elements where order doesn’t
matter).