Help me improve Hash#rekey

The note TODO: Improve Hash#rekey code!!! has been in my docs for too
long. I could use other’s insights and thought others might enjoy the
challenge. So here’s the code:

require ‘facets/na’

class Hash

# Rekey a hash:
#
#   rekey()
#   rekey(from_key => to_key, ...)
#   rekey{|from_key| to_key}
#   rekey{|from_key, value| to_key}
#
# If a key map is given, then the first key is changed to the second

key.
#
# foo = { :a=>1, :b=>2 }
# foo.rekey(:a=>‘a’) #=> { ‘a’=>1, :b=>2 }
# foo.rekey(:b=>:x) #=> { :a =>1, :x=>2 }
# foo.rekey(‘foo’=>‘bar’) #=> { :a =>1, :b=>2 }
#
# If a block is given, converts all keys in the Hash accroding to
the
# given block procedure. If the block returns +NA+ for a given key,
# then that key will be left intact.
#
# foo = { :name=>‘Gavin’, :wife=>:Lisa }
# foo.rekey{ |k| k.to_s } #=> { “name”=>“Gavin”, “wife”=>:Lisa }
# foo #=> { :name =>“Gavin”, :wife=>:Lisa }
#
# If no key map or block is given, then all keys are converted
# to Symbols.
#
# Note that if both a +key_map+ and a block are given, the +key_map+
is
# applied first then the block.
#
# CREDIT: Trans, Gavin K.

def rekey(key_map=nil, &block)
  if !(key_map or block)
    block = lambda{|k| k.to_sym}
  end

  key_map ||= {}

  hash = dup.replace({})  # to keep default_proc

  (keys - key_map.keys).each do |key|
    hash[key] = self[key]
  end

  key_map.each do |from, to|
    hash[to] = self[from] if key?(from)
  end

  if block
    hash2 = dup.replace({})
    case block.arity
    when 2  # TODO: is this condition needed?
      hash.each do |k, v|
        nk = block.call(k,v)
        nk = (NA == nk ? k : nk)
        hash2[nk] = v
      end
    else
      hash.each do |k, v|
        nk = block.call(k)
        nk = (NA == nk ? k : nk)
        hash2[nk] = v
      end
    end
  else
    hash2 = hash
  end

  hash2
end

# Synonym for Hash#rekey, but modifies the receiver in place (and

returns it).
#
# foo = { :name=>‘Gavin’, :wife=>:Lisa }
# foo.rekey!{ |k| k.to_s } #=> { “name”=>“Gavin”, “wife”=>:Lisa
}
# foo #=> { “name”=>“Gavin”, “wife”=>:Lisa
}
#
# CREDIT: Trans, Gavin K.

def rekey!(key_map=nil, &block)
  replace(rekey(key_map, &block))
end

end

Not the use of facets/na. That is defined as:

class << NA = ArgumentError.new
  def inspect ; 'N/A' ; end
  def method_missing(*); self; end
end

But it is really nothing more than a dummy object used to mean Not
Applicable. So in the case of #rekey, if the block returns NA then the
key
goes unchanged. Thinking about it again now, it’s probably unnecessary,
but
I had wanted a way to say “leave it alone” while also making sure that
nil could still be used as a key (even if that’s rare). Feel free to
remove the NA business, but if you do please explain why you think its
not
needed.

Best solution will get their name put in front of CREDITs for the next
release of Facets.

On Sun, Dec 9, 2012 at 7:19 PM, Intransition [email protected]
wrote:

The note TODO: Improve Hash#rekey code!!! has been in my docs for too
long. I could use other’s insights and thought others might enjoy the
challenge. So here’s the code:

I think the problem is in the API. That’s what makes it too complex.
There are too many cases of specific handling and options (see below):

#
# then that key will be left intact.
  1. Unnecessary option: if the key is supposed to stay intact the block
    should just return the original key.
#
#   foo = { :name=>'Gavin', :wife=>:Lisa }
#   foo.rekey{ |k| k.to_s }  #=>  { "name"=>"Gavin", "wife"=>:Lisa }
#   foo                      #=>  { :name =>"Gavin", :wife=>:Lisa }
#
# If no key map or block is given, then all keys are converted
# to Symbols.
  1. Why that default? In my mind this is too much implicit logic.
    Also this can be easily achieved with

hash.rekey(&:to_sym)

#
# Note that if both a +key_map+ and a block are given, the +key_map+ is
# applied first then the block.
  1. I would change that to EITHER block OR map argument, but not both.

I had wanted a way to say “leave it alone” while also making sure that nil
could still be used as a key (even if that’s rare). Feel free to remove the
NA business, but if you do please explain why you think its not needed.

If one needs a special key one can use a Symbol for that as
efficiently as nil. nil is the value return if something is absent
and I believe it does not make for a good key in a Hash.

Best solution will get their name put in front of CREDITs for the next
release of Facets.

:slight_smile:

class Hash
def rekey(mapping = nil, &convert)
c = convert || mapping
dup.tap do |h| # preserve type and defaults
h.clear
each_pair {|k, v| h[c[k] || k] = v}
end
end
end

Kind regards

robert

On Monday, December 10, 2012 7:22:17 AM UTC-5, Robert K. wrote:

  1. Unnecessary option: if the key is supposed to stay intact the block
    should just return the original key.

I agree. Just wanted someone else to confirm. Out it goes!

hash.rekey(&:to_sym)

I can understand that. I used the default b/c the vast majority of the
time
I was using it to convert to symbols. Rather then have yet another
method
like ActiveSupport’s symbolize_keys it made more sense to me to just
give
#rekey this as the default.

I can’t really take that back now. It’s been too long part of the API.

#
# Note that if both a +key_map+ and a block are given, the +key_map+

is

# applied first then the block.
  1. I would change that to EITHER block OR map argument, but not both.

Yea, I thought about that when I added support for the mapping. That was
actually something that came later that the original block form. At
first I
had thought about making an entirely different method, but then thought
that was a waste and that it made more sense as part of #rekey. So when
I
first added it I did an “either or”, just as you suggest. But then I
thought “why?” it certainly can handle both even if people will almost
never use both.

What do you think? Does it really matters enough to change it now? Like
I
said, I doubt anyone has used both, so this is something that could be
change if it really is worth it.

goes unchanged. Thinking about it again now, it’s probably unnecessary,

Wouldn’t use symbols b/c then you have a special exception. NA was made
just for such cases. But you probably right that nil doesn’t make a
good
hash key no matter what. Nonetheless, it doesn’t really matter b/c as
you
said above, they can just return the original key.

  each_pair {|k, v| h[c[k] || k] = v}
end

end
end

Sweet. Much smaller than mine, that’s for damn sure!!! Put the default
:to_sym back in and we could have a deal :slight_smile:

I’m need to test and benchmark it first though.

Oh, and nice use of polymorphism using #[] for both proc and hash
retrieval!

On Mon, Dec 10, 2012 at 8:44 PM, Intransition [email protected]
wrote:

There are too many cases of specific handling and options (see below):

  1. Unnecessary option: if the key is supposed to stay intact the block
    should just return the original key.

I agree. Just wanted someone else to confirm. Out it goes!

:slight_smile:

hash.rekey(&:to_sym)

I can understand that. I used the default b/c the vast majority of the time
I was using it to convert to symbols. Rather then have yet another method
like ActiveSupport’s symbolize_keys it made more sense to me to just give
#rekey this as the default.

I can’t really take that back now. It’s been too long part of the API.

Well, then you mustn’t change anything which affects observable
behavior.

had thought about making an entirely different method, but then thought that
was a waste and that it made more sense as part of #rekey. So when I first
added it I did an “either or”, just as you suggest. But then I thought
“why?” it certainly can handle both even if people will almost never use
both.

What do you think? Does it really matters enough to change it now? Like I
said, I doubt anyone has used both, so this is something that could be
change if it really is worth it.

I would get rid of that behavior. After all, if someone wants do do
Hash lookups as part of conversion they can still do that in the block
without much effort.

I also noticed a slight asymmetry between block and Hash: a block will
return something for every key passed in. But since a Hash is fixed
at the time of method call (modulo default_proc of course) there is
not really a way to handle absent keys ad hoc.

goes unchanged. Thinking about it again now, it’s probably unnecessary,

:slight_smile:

Sweet. Much smaller than mine, that’s for damn sure!!!

Yeah, but that’s easy when one removes the complexity in behavior.

Put the default
:to_sym back in and we could have a deal :slight_smile:

Well, that’s easy, isn’t it?

def rekey(mapping = nil, &convert)
c = convert || mapping || :to_sym.to_proc

I’m need to test and benchmark it first though.

Have fun!

Oh, and nice use of polymorphism using #[] for both proc and hash retrieval!

Thank you! :slight_smile:

Kind regards

robert

On Tuesday, December 11, 2012 6:55:32 AM UTC-5, Robert K. wrote:

Oh, and nice use of polymorphism using #[] for both proc and hash
retrieval!

Thank you! :slight_smile:

Well, everything went well except for one thing, which was a bit
irritating
actually – I couldn’t find a way to concisely handle a block with
either
one argument (just the key) or two arguments (the key and the value).
Seems
like that should be doable in one block call using a splat. But I had to
have a condition on arity and two separate each_pair calls, one for each
case. Maybe it’s just not possible.

In any case, thanks for your help! The code is looking much better.