"map" a deeply nested structure: Object#deep_map

Hi,

most of you probably know/use Ruby F.s

http://rubyworks.github.com/facets/

I’ve recently posted a question on Facets discussion group:

http://groups.google.com/group/facets-universal/browse_thread/thread/683215a35f06af36

and I also tried my implementation (which apparently works fine):

class Object

  def deep_map(&block)
    if self.respond_to? :each_pair
      out = {}
      self.each_pair do |k, v|
        out[k] = v.deep_map(&block)
      end
      return out
    elsif self.respond_to? :each
      out = []
      self.each do |x|
        out << x.deep_map(&block)
      end
      return out
    else
      return block.call(self)
    end
  end

end

Is there room for improvements? Facets author suggested to start a
discussion here too.

So: what is, in your opinion, the best way to map a deeply nested
structure made up of Arrays, Hashes, Array of Hashes etc. etc. ?

Thanks,
Guido

On Wed, Oct 6, 2010 at 4:43 PM, Guido De Rosa [email protected]
wrote:

and I also tried my implementation (which apparently works fine):

http://snurl.com/19n6qh

class Object

 def deep_map(&block)
   if self.respond_to? :each_pair
     out = {}
     self.each_pair do |k, v|
       out[k] = v.deep_map(&block)

What about keys? Hash#map allows to map keys and values

irb(main):001:0> h={1=>2,3=>4}
=> {1=>2, 3=>4}
irb(main):002:0> h.map {|k,v| [k10,v100]}
=> [[10, 200], [30, 400]]

What, if the key is an Array or Hash? Wouldn’t this be more
appropriate?

       out[k.deep_map(&block)] = v.deep_map(&block)

OTOH then you cannot map key and value together.

   end
 end

end

Is there room for improvements? Facets author suggested to start a
discussion here too.

You could remove lines “out = …” and replace them by

out = self.class.new

So: what is, in your opinion, the best way to map a deeply nested
structure made up of Arrays, Hashes, Array of Hashes etc. etc. ?

I wonder how many use cases for this there are actually. In your
example you can uniformly treat each object since you want the
object_id. That would work for a few other methods as well (e.g.
#inspect, #to_s). In other cases you would have to discriminate
treatment of leave values in your block, e.g.

x.deep_map |y|
case y
when String

when Fixnum

else

end

That list could become lengthy. Basically you implemented tree
traversal with double dispatch - only that the double dispatch must be
done manually in the block. :slight_smile:

Kind regards

robert

What about keys? Hash#map allows to map keys and values

irb(main):001:0> h={1=>2,3=>4}
=> {1=>2, 3=>4}
irb(main):002:0> h.map {|k,v| [k10,v100]}
=> [[10, 200], [30, 400]]

What, if the key is an Array or Hash? Wouldn’t this be more
appropriate?

       out[k.deep_map(&block)] = v.deep_map(&block)

I was not interested in mapping keys, but this would be a reasonable
extra feature.

So deep_map would be used like that:

object.deep_map{|k, v| ... [result_key, result_value]}

resembling Hash#map behavior

and when object is not Hash-like, result_key would be simply ignored.

It should not be hard to allow client code to use also “the previous
API”:

object.deep_map{|x| ... }

by checking block.arity in the implementation code.

You could remove lines “out = …” and replace them by

out = self.class.new

Good point… apparently!

What if self is a Range?

Forcing to return an Array (or a Hash) is not so bad: even the standard
method Enumerable#map does so! And so does Enumerable#sort and many
others: it’s just to avoid nonsensical situations :slight_smile:

I wonder how many use cases for this there are actually. In your
example you can uniformly treat each object since you want the
object_id. That would work for a few other methods as well (e.g.
#inspect, #to_s). In other cases you would have to discriminate
treatment of leave values in your block, e.g.

x.deep_map |y|
case y
when String

when Fixnum

else

end

That list could become lengthy. Basically you implemented tree
traversal with double dispatch - only that the double dispatch must be
done manually in the block. :slight_smile:

The same holds for standard Enumerable#map and Hash#map, so…

I just want to write the recursive version of some Ruby core methods,
resembling by many aspects pretty much the same behavior, including the
fact that some checks are up to the user :wink:

G.

Robert K. wrote:

OTOH you say you want to mimic #map behavior and since this is part of
the standard behavior people might expect to be able to map keys as
well if this goes into a library.

Right.

� �object.deep_map{|x| … }

by checking block.arity in the implementation code.

I think that’s not necessary. You can just pass an Array to get the
same behavior as Hash#each:

irb(main):002:0> def f1; yield [1,2] end
=> nil
irb(main):003:0> f1 {|a| p a}
[1, 2]
=> [1, 2]
irb(main):004:0> f1 {|a,b| p a}
1
=> 1
irb(main):005:0> {1=>2}.each {|a,b| p a}
1
=> {1=>2}
irb(main):006:0> {1=>2}.each {|a| p a}
[1, 2]
=> {1=>2}
irb(main):007:0>

Ok. Thanks.

But according to that logic all collections returned should be
Arrays. So out = {} would become out = [].

Enumerable#map returns an Array on any object except wen the object is a
Hash: in such case it returns another Hash. And so should do #deep_map.

The difference is that in a Hash or Array values are usually uniform
while with a recursive structure it is much more likely that they are
not. So in the case of Enumerable#map you typically know what objects
you map while in the recursive case you rather need checks.

I still wonder about the usability of this.

Of course it depends by the usage scenario, and I actually found such a
need in one of my production projects. I may tell you longer detail if
you wish…

Of course there’s no need that the values belong to the same class or to
derived class, they just need to be uniform in the duck-typing sense.

Consider something like:

my_deep_structure.deep_map{|v| v.to_s}

any of this value may belong to a different class but still in
Your::Application::NameSpace::

This doesn’t appear so uncommon or unrealistic.

You may have designed your application so that any of your classes
respond to to_s or some other method which “exports” data.

The result of deep_map will not contain any reference to your
application logic and may easily “inter-operate” with others: I fond
this more convenient that overwriting to_json, to_yaml, marshal_dump and
so on…

G.

On Wed, Oct 6, 2010 at 6:05 PM, Guido De Rosa [email protected]
wrote:

       out[k.deep_map(&block)] = v.deep_map(&block)

I was not interested in mapping keys, but this would be a reasonable
extra feature.

OTOH you say you want to mimic #map behavior and since this is part of
the standard behavior people might expect to be able to map keys as
well if this goes into a library.

object.deep_map{|x| … }

by checking block.arity in the implementation code.

I think that’s not necessary. You can just pass an Array to get the
same behavior as Hash#each:

irb(main):002:0> def f1; yield [1,2] end
=> nil
irb(main):003:0> f1 {|a| p a}
[1, 2]
=> [1, 2]
irb(main):004:0> f1 {|a,b| p a}
1
=> 1
irb(main):005:0> {1=>2}.each {|a,b| p a}
1
=> {1=>2}
irb(main):006:0> {1=>2}.each {|a| p a}
[1, 2]
=> {1=>2}
irb(main):007:0>

You could remove lines “out = …” and replace them by

out = self.class.new

Good point… apparently!

What if self is a Range?

Good point.

Forcing to return an Array (or a Hash) is not so bad: even the standard
method Enumerable#map does so! And so does Enumerable#sort and many
others: it’s just to avoid nonsensical situations :slight_smile:

But according to that logic all collections returned should be
Arrays. So out = {} would become out = [].

when Fixnum

I just want to write the recursive version of some Ruby core methods,
resembling by many aspects pretty much the same behavior, including the
fact that some checks are up to the user :wink:

The difference is that in a Hash or Array values are usually uniform
while with a recursive structure it is much more likely that they are
not. So in the case of Enumerable#map you typically know what objects
you map while in the recursive case you rather need checks.

I still wonder about the usability of this.

Kind regards

robert

On Thu, Oct 7, 2010 at 10:22 AM, Guido De Rosa [email protected]
wrote:

Robert K. wrote:

But according to that logic all collections returned should be
Arrays. So out = {} would become out = [].

Enumerable#map returns an Array on any object except wen the object is a
Hash: in such case it returns another Hash. And so should do #deep_map.

11:04:25 ~$ allruby -e ‘p({1=>2}.map {|a,b| [b,a]}.class)’
CYGWIN_NT-5.1 padrklemme2 1.7.7(0.230/5/3) 2010-08-31 09:58 i686 Cygwin

ruby 1.8.7 (2008-08-11 patchlevel 72) [i386-cygwin]
Array

ruby 1.9.1p429 (2010-07-02 revision 28523) [i386-cygwin]
Array

jruby 1.4.0 (ruby 1.8.7 patchlevel 174) (2009-11-02 69fbfa3) (Java
HotSpot™ Client VM 1.6.0_21) [x86-java]
Array

I see you noticed already.

The difference is that in a Hash or Array values are usually uniform
while with a recursive structure it is much more likely that they are
not. So in the case of Enumerable#map you typically know what objects
you map while in the recursive case you rather need checks.

I still wonder about the usability of this.

Of course it depends by the usage scenario, and I actually found such a
need in one of my production projects. I may tell you longer detail if
you wish…

Yes, please.

Of course there’s no need that the values belong to the same class or to
derived class, they just need to be uniform in the duck-typing sense.

Consider something like:

my_deep_structure.deep_map{|v| v.to_s}

What do you need that for? If it is for printing or logging then you
can use my_deep_structure.to_s or my_deep_structure.inspect.

this more convenient that overwriting to_json, to_yaml, marshal_dump and
so on…

You really make me curious about your use case.

Cheers

robert

Guido De Rosa wrote:

Robert K. wrote:

But according to that logic all collections returned should be
Arrays. So out = {} would become out = [].

Enumerable#map returns an Array on any object except wen the object is a
Hash: in such case it returns another Hash. And so should do #deep_map.

Ouch! It actually returns an Array of key-value pairs. Sorry for the
mistake. I will consider you point, then :wink: .

G.

Robert K. wrote:

Guido De Rosa wrote:

Of course it depends by the usage scenario, and I actually found such a
need in one of my production projects. I may tell you longer detail if
you wish…

Yes, please.

Of course there’s no need that the values belong to the same class or to
derived class, they just need to be uniform in the duck-typing sense.

Consider something like:

my_deep_structure.deep_map{|v| v.to_s}

What do you need that for? If it is for printing or logging then you
can use my_deep_structure.to_s or my_deep_structure.inspect.

this more convenient that overwriting to_json, to_yaml, marshal_dump and
so on…

You really make me curious about your use case.

This is embarrassing but I’m thinking - right now - that I made some
other mistakes in “my use case”. Thanks for the head-ups, this
discussion has proven useful.

What I will do, if you’re still curious, is just provide a link to my
fixes (as git commits, when they are done).

Now, don’t mind my use case. Let’s return to my first post.

My conclusion is that the method I wrote should be renamed
deep_map_values. And another method may be called deep_rekey with
obvious behavior.

Hash#map is totally another beast:

  1. it mixes key and values in the transformation (which may lead to
    unwanted results in a hypothetical recursive version IMHO)

  2. it turns an Hash into an Array of pairs (which may be what you
    want, or not)

Regards,
Guido

Perhaps what is being sought here is a form of search and replace.

class Array
def deep_replace(replacement, &match)
map do |e|
if e.respond_to?(:deep_replace)
e.deep_replace(replacement, &match)
else
match[e] ? replacement : e
end
end
end

require ‘facets/hash/mash’ #[1]

class Hash
def deep_replace(replacement, &match)
mash do |k,v|
nk = if k.respond_to?(:deep_replace)
k.deep_replace(replacement, &match)
else
match[k] ? replacement : k
end
nv = if v.respond_to?(:deep_replace)
v.deep_replace(replacement, &match)
else
match[v] ? replacement : v
end
[nk, nv]
end
end

Not tested but you get the idea.

I’m not 100% sure about handling the key for Hash either, but perhaps
an option could toggle that usage on or off.

[1]http://blog.thepete.net/2010/02/ruby-facets-mash-method.html

Thomas S. wrote:

Perhaps what is being sought here is a form of search and replace.

class Array
def deep_replace(replacement, &match)
map do |e|
if e.respond_to?(:deep_replace)
e.deep_replace(replacement, &match)
else
match[e] ? replacement : e
end
end
end

require ‘facets/hash/mash’ #[1]

class Hash
def deep_replace(replacement, &match)
mash do |k,v|
nk = if k.respond_to?(:deep_replace)
k.deep_replace(replacement, &match)
else
match[k] ? replacement : k
end
nv = if v.respond_to?(:deep_replace)
v.deep_replace(replacement, &match)
else
match[v] ? replacement : v
end
[nk, nv]
end
end

Not tested but you get the idea.

I’m not 100% sure about handling the key for Hash either, but perhaps
an option could toggle that usage on or off.

[1]http://blog.thepete.net/2010/02/ruby-facets-mash-method.html

That way you can say for example:

Here's a deeply nested object; replace any "content" n which is a
Fixnum with the number 123

But you cannot say:

Here's a deeply nested object; replace any "content" n which is a
Fixnum with the number n + 1

#deep_replace would be a lot more flexible if replacement could be a
lambda.

The drawback is that client code would look ugly:

o.deep_replace(lambda {|x| x+1}) {|y| ...}

Is there an elegant way o pass two blocks of code?

Maybe a more consistent API would look like:

o.deep_replace(
  lambda {|x| ... }, # matching conditions
  lambda {|x| ... }  # replacement code
)

or like:

o.deep_replace(
  lambda {|k, v| ...                }, # matching conditions
  lambda {|k, v| ... [new_k, new_v] }  # replacement code
)

But more simply the “matching” may be done by the client code:

o.deep_replace do |k, v|
if matching_conditions(k, v)
# do the replecement

[new_k, new_v]
else
[k, v] # leave the same
end
end

so, again, we have to pass one block of code, but for the replacement,
not the match.

G.

On Oct 7, 1:18 pm, Guido De Rosa [email protected] wrote:

#deep_replace would be a lot more flexible if replacement could be a
lambda.

The drawback is that client code would look ugly:

o.deep_replace(lambda {|x| x+1}) {|y| ...}

Is there an elegant way o pass two blocks of code?

Indeed there is. In fact that is something akin to how the new Facets
#recursively method works.

arr = [“a”, [“b”, “c”]]
arr.recursively{ |a| a.reverse }.map{ |v| v.to_sym }
#=> [:a, [:c, :b]]

The first block handles enumerables and the second handles non-
enumerable elements.

(See
facets/lib/core/facets/array/recursively.rb at main · rubyworks/facets · GitHub)

But I am not sure #recursively can help in the search-and-replace
case.

This leads me to wonder about a more general API, e.g.

def match(&match)
Enumerable::Matcher.new(&match)
end

And then Enumerable::Matcher could have different methods for what to
do with the various matches, such as #replace or #delete.

  lambda {|k, v| ...                }, # matching conditions
 else
   [k, v] # leave the same
 end

end

so, again, we have to pass one block of code, but for the replacement,
not the match.

But we still have the problem of matching |k,v| for Hashes but |e| for
Arrays.