Iterating over an Array of Hashes

All,

When iterating over an Array of Hashes repeatedly, the ‘each’ method
appears to pass a reference to - not a copy of - each element in the
array.

Is there a way to take a copy of the data, rather than a reference to
the data, so that any changes to each hash aren’t kept?

Explaining it another way - I have code similar to the following:

a = [ '2011-01-01', '2011-01-02', '2011-01-03' ]
b = [ { :x => "foo" }, { :x => "bar" }, { :x => "baz" } ]

a.each do |a_item|
  puts a_item

  b.each do |b_item|
    b_item[:date] = a_item
    # DO SOMETHING WITH B
  end
end

What I want to do within the b.each loop is work on a copy of each
element of b, such that if I mess around with it, the changes are lost
when the loop exits.

What is the magic I am missing?

Peter

On Wed, May 04, 2011 at 07:58:04AM +0900, Peter H. wrote:

What I want to do within the b.each loop is work on a copy of each
element of b, such that if I mess around with it, the changes are lost
when the loop exits.

I think this might do what you want:

Marshal.load(Marshal.dump(b)).each do |b_item|
  # do stuff
end

It’s ugly and kludgey, but it should work.
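
A minimal sketch of how that round trip might slot into Peter's nested loop, assuming every value in b can be serialized by Marshal (true for the strings and symbols in the example data):

```ruby
a = ['2011-01-01', '2011-01-02', '2011-01-03']
b = [{ :x => "foo" }, { :x => "bar" }, { :x => "baz" }]

a.each do |a_item|
  # Marshal.dump serializes b; Marshal.load rebuilds an independent copy.
  Marshal.load(Marshal.dump(b)).each do |b_item|
    b_item[:date] = a_item   # only the throwaway copy gets a :date key
  end
end

p b   # the original hashes still have only their :x key
```

The copy is rebuilt on every pass of the outer loop, so nothing done to b_item survives into the next pass.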

hi Peter -

well, this is certainly no magic - but you could just make a temp
array, and check each entry to make sure you like it before committing
it to “b.”
there’s probably a better way to do this, but here’s what i came up
with:

a = [ '2011-01-01', '2011-01-02', '2011-01-03' ]
b = [ { :x => "foo" }, { :x => "bar" }, { :x => "baz" } ]
temp = []

a.each{|a_item| temp << {:date => a_item}}

temp.each{|entry|
  unless entry.inspect.include?("02") #or something else more relevant
    idx = temp.index(entry)
    b[idx] = entry
  end
}

temp = []

p b

#=> [{:date=>"2011-01-01"}, {:x=>"bar"}, {:date=>"2011-01-03"}]

  • j

Peter H. wrote in post #996483:

What is the magic I am missing?

clone():

b = [ { :x => "foo" }, { :x => "bar" }, { :x => "baz" } ]
count = 0

b.each do |hash|
  hash_copy = hash.clone
  hash_copy[:x] = count
  count += 1

  p hash_copy
end

b.each do |hash|
  p hash
end

--output:--
{:x=>0}
{:x=>1}
{:x=>2}
{:x=>"foo"}
{:x=>"bar"}
{:x=>"baz"}
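
Applied to the nested loop from the original question, the same idea might look like the sketch below. The results array is a hypothetical name added only to make the effect observable. Hash#merge is shown as an alternative that was not mentioned in the thread; it also returns a new hash and leaves the receiver untouched.

```ruby
a = ['2011-01-01', '2011-01-02', '2011-01-03']
b = [{ :x => "foo" }, { :x => "bar" }, { :x => "baz" }]
results = []   # hypothetical: collects the modified copies

a.each do |a_item|
  b.each do |b_item|
    item = b_item.clone    # shallow copy of one hash
    item[:date] = a_item   # changes the copy only
    results << item
  end
end

# Alternative: merge builds a new hash in one step.
merged = b.first.merge(:date => a.first)

p b        # originals are unchanged
p merged
```

Both variants copy only the hash itself, so the values inside are still shared with the originals.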

John F. wrote in post #996498:

On Tue, May 3, 2011 at 20:34, 7stud wrote:

Peter H. wrote in post #996483:

What is the magic I am missing?

clone():

clone doesn’t cut it, since it’s creating shallow copies. An
illustration of the problem:

clone() cuts just fine. A shallow copy is all the op needs.

Your example doesn’t contain nested hashes, while mine does. That’s
what I was demonstrating – it’s a shallow copy, not a deep one.

~ jf

John F.
Principal Consultant, BitsBuilder
LI: http://www.linkedin.com/in/johnxf
SO: User John Feminella - Stack Overflow

Any problem with the marshal dump and load approach of Chad P.?

Look…

arr = [{:fruit => {:kind => 'apple'}}, {:fruit => {:kind => 'banana'}}]

Marshal.load(Marshal.dump(arr)).each do |h|
  h[:fruit][:kind] = 'coconut'
end

If it’s ugly, you can try to beautify it…

class Object
  def deep_copy
    Marshal.load(Marshal.dump(self))
  end
end

arr.deep_copy.each do |h|
  h[:fruit][:kind] = 'coconut'
end

Abinoam Jr.

On Tue, May 3, 2011 at 20:34, 7stud wrote:

Peter H. wrote in post #996483:

What is the magic I am missing?

clone():

clone doesn’t cut it, since it’s creating shallow copies. An
illustration of the problem:

==== begin snippet ====
arr = [{:fruit => {:kind => 'apple'}}, {:fruit => {:kind => 'banana'}}]

=> [{:fruit=>{:kind=>"apple"}}, {:fruit=>{:kind=>"banana"}}]

arr.map(&:clone).each { |h| h[:fruit][:kind] = 'coconut' }

=> [{:fruit=>{:kind=>"coconut"}}, {:fruit=>{:kind=>"coconut"}}]

==== end snippet ====

Notice that arr now has coconuts in it, instead of the original
apples and bananas.
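
One way to see why this happens is to compare object identity. The sketch below uses equal?, which is true only when both sides are the very same object; the variable names are illustrative.

```ruby
arr = [{ :fruit => { :kind => 'apple' } }]

outer_copy = arr.first.clone

# The outer hash is a new object...
same_outer = outer_copy.equal?(arr.first)                   # => false
# ...but the nested hash is shared between original and copy.
same_inner = outer_copy[:fruit].equal?(arr.first[:fruit])   # => true

outer_copy[:fruit][:kind] = 'coconut'
p arr   # the original now reports 'coconut' as well
```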

~ jf


On Wed, May 04, 2011 at 12:28:18PM +0900, Christopher D. wrote:

On Tue, May 3, 2011 at 7:08 PM, John F. wrote:

Your example doesn’t contain nested hashes, while mine does.

Neither did Peter’s example and request.

His example was an array with nested hashes – that is, hashes nested in
an array. It was not hashes nested in hashes, but I do not think that
was what John meant anyway.

that simply using #dup or #clone won’t work.
I went back and read the original request. While he did not use the
words “deep copy”, the implication of his request seemed pretty clearly
to ask for exactly that, in my estimation. He referred to an array of
hashes; he referred to things being “references” rather than “copies”
(the more formal terminology for those with a pedantic bent would be
“reference copies rather than value copies”); and he referred to the
desire to be able to operate on his data structure in a loop, making
changes, and have those changes lost when the loop exits rather than
saved in the data structure that existed before copying.

The implications of these requirements add up to a request for a way to
get a deep copy.

OTOH, if Peter needs to be able to handle arbitrary Ruby objects in the
hashes, using Marshal.load(Marshal.dump(whatever)) won’t work, since
there are objects that can’t be dumped.

It meets the need of the example data for the requirements indicated
above. I am not aware of any (relatively trivial) solution to this much
broader problem you’re suggesting.

What if the values in the hash are lambdas?

Then yeah, the Marshal dump/load approach doesn’t work. It does work
for the presented example, though, whereas (given the implications of
the requests in the original questions) your dup-or-clone approach does
not work.
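
For anyone who wants to see that failure mode, here is a small sketch. The :callback key is a hypothetical addition to Peter's data; Marshal.dump raises a TypeError when it meets a Proc, while a shallow clone copes because it never looks inside the values.

```ruby
b = [{ :x => "foo", :callback => lambda { |s| s.upcase } }]

begin
  Marshal.load(Marshal.dump(b))
  dumped = true
rescue TypeError => e
  dumped = false
  puts "cannot deep-copy with Marshal: #{e.message}"
end

# A shallow clone simply shares the lambda with the original hash.
shallow = b.first.clone
puts shallow[:callback].call(shallow[:x])
```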

The load/dump mechanism meets some other need, but doesn’t meet the
original request for hashes with arbitrary data (and is an
extraordinarily convoluted mechanism for meeting the original request
even where it works.)

  1. It meets the needs of the presented example data.

  2. Is there some less-convoluted approach that works for the more
    complex needs you presented?

  3. How exactly does this make a dup-or-clone approach such as what
    you presented work?

It’s possible to craft a generic deep-copying mechanism, if you need it,
but Marshal.load(Marshal.dump(whatever)) isn’t it.

It is not clear to me “generic deep-copying” is exactly what’s needed.

Hi,
I’m running a pc with Windows Vista…
My email flags your emails with ‘phishing’ warning…
Thought you might want to know…
Good day

----- Original Message -----
From: “Chad P.”
To: “ruby-talk ML”
Sent: Wednesday, May 04, 2011 11:18 AM
Subject: Re: Iterating over an Array of Hashes

On Tue, May 3, 2011 at 7:08 PM, John F. wrote:

Your example doesn’t contain nested hashes, while mine does.

Neither did Peter’s example and request.

That’s what I was demonstrating – it’s a shallow copy, not a deep
one.

Simply using #dup or #clone meets the original request as presented,
and does so for arbitrary Ruby objects in the hashes.

If Peter needs a deep copy, which Peter didn’t ask for (just a
copy of the data in each hash in the array – not nested hashes – so
that the originals in the array wouldn’t be modified on iteration) it’s
true that simply using #dup or #clone won’t work.

OTOH, if Peter needs to be able to handle arbitrary Ruby objects in
the hashes, using Marshal.load(Marshal.dump(whatever)) won’t work,
since there are objects that can’t be dumped.

What if the values in the hash are lambdas?

The load/dump mechanism meets some other need, but doesn’t meet the
original request for hashes with arbitrary data (and is an
extraordinarily convoluted mechanism for meeting the original request
even where it works.)

It’s possible to craft a generic deep-copying mechanism, if you need
it, but Marshal.load(Marshal.dump(whatever)) isn’t it.
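
A rough sketch of what such a hand-rolled mechanism could look like, offered as an illustration and not as anything posted in the thread. The method name deep_copy_value is hypothetical. It recurses into Hashes and Arrays, duplicates Strings, and shares everything else (lambdas included) by reference. It makes no attempt to handle cyclic structures, default procs, or instances of custom classes.

```ruby
# Hypothetical helper: copies containers and strings, shares the rest.
def deep_copy_value(obj)
  case obj
  when Hash
    obj.each_with_object({}) do |(key, value), copy|
      copy[deep_copy_value(key)] = deep_copy_value(value)
    end
  when Array
    obj.map { |element| deep_copy_value(element) }
  when String
    obj.dup
  else
    obj   # symbols, numbers, lambdas and so on are returned as-is
  end
end

original = [{ :fruit => { :kind => 'apple'.dup }, :fn => lambda { 1 } }]
copy = deep_copy_value(original)

copy[0][:fruit][:kind] << 's'   # mutate a nested string in the copy
copy[0][:fruit][:size] = 3      # add a key to a nested hash in the copy

p original[0][:fruit]   # the nested hash in the original is untouched
```

Sharing the lambda is a deliberate trade-off in this sketch: it avoids the TypeError that Marshal would raise, at the cost of not isolating whatever state the lambda closes over.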

On Thu, May 05, 2011 at 04:17:53AM +0900, Christopher D. wrote:

The #clone based method presented upthread (which wasn’t mine, it was
7stud’s) works perfectly for the data and requirements presented. It
doesn’t work if, in addition to wanting to lose changes to the passed
hashes, one also wants to lose any changes resulting from manipulation
of the data in the hashes, but that wasn’t requested.

Are you seriously claiming that the person’s statements seemed to you to
indicate a desire for the data to change in the original data structure?

Seriously?

That is, since you work on a local copy of each hash, mutating methods
called on the hash won’t have any lasting effect, but mutating methods
called on keys or values extracted from the (local copy of the) hash
will affect the objects stored in the original hash. Avoiding this kind
of effect was not requested.

It really seemed implied from where I was sitting.

Sure, but it’s way overkill for it. You only need deep copies if you
have requirements that weren’t stated (avoiding effects from mutating
methods called not on the hashes being passed but on the keys or values
of the hashes.)

I’m not sure why you’re so pedantically splitting hairs when the actual
intent seemed pretty obvious: no changes.

  2. Is there some less-convoluted approach that works for the more
    complex needs you presented?

I’m not sure what relevance this has.

I think you’re playing dumb.

What if the values in the hash are lambdas?

Then yeah, the Marshal dump/load approach doesn’t work. It does work
for the presented example, though, whereas (given the implications of the
requests in the original questions) your dup-or-clone approach does not
work.

The #clone based method presented upthread (which wasn’t mine, it was
7stud’s) works perfectly for the data and requirements presented. It
doesn’t work if, in addition to wanting to lose changes to the passed
hashes, one also wants to lose any changes resulting from manipulation
of the data in the hashes, but that wasn’t requested.

That is, since you work on a local copy of each hash, mutating methods
called on the hash won’t have any lasting effect, but mutating methods
called on keys or values extracted from the (local copy of the) hash
will affect the objects stored in the original hash. Avoiding this
kind of effect was not requested.
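
The distinction can be made concrete with a short sketch (variable names are illustrative): assigning to a key of the cloned hash leaves the original alone, while calling a mutating method such as String#<< on a value reaches the object that both hashes share.

```ruby
b = [{ :x => "foo".dup }]   # .dup guarantees an unfrozen String

copy = b.first.clone
copy[:x] = "changed"        # rebinding a key affects the copy only
after_assignment = b.first[:x].dup

copy = b.first.clone
copy[:x] << "!"             # mutates the String both hashes point to
after_mutation = b.first[:x]

puts after_assignment   # foo
puts after_mutation     # foo!
```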

The load/dump mechanism meets some other need, but doesn’t meet the
original request for hashes with arbitrary data (and is an
extraordinarily convoluted mechanism for meeting the original request
even where it works.)

  1. It meets the needs of the presented example data.

Sure, but it’s way overkill for it. You only need deep copies if you
have requirements that weren’t stated (avoiding effects from mutating
methods called not on the hashes being passed but on the keys or
values of the hashes.)

  2. Is there some less-convoluted approach that works for the more complex
    needs you presented?

I’m not sure what relevance this has. A deep copying solution that
handles arbitrary rather than merely serializable objects at the
fringes is going to be more complex than
Marshal.load(Marshal.dump(…)), but since either approach is more
convoluted than what is needed here, I don’t see why that matters here
(and, since Marshal.load(Marshal.dump(…)) doesn’t work for the cases
where you’d need the more sophisticated deep copying solution, I don’t
see why the fact that the latter would be more complex would matter
there, either – a simpler approach that doesn’t work is still no
solution at all.)

  3. How exactly does this make a dup-or-clone approach such as what
    you presented work?

“this” doesn’t make the dup-or-clone approach work, it is orthogonal
to the fact that the dup-or-clone approach, as presented (in #clone
form) by 7stud upthread does work for the scenario presented.

On Wed, May 4, 2011 at 1:50 PM, Chad P. wrote:

On Thu, May 05, 2011 at 04:17:53AM +0900, Christopher D. wrote:

The #clone based method presented upthread (which wasn’t mine, it was
7stud’s) works perfectly for the data and requirements presented. It
doesn’t work if, in addition to wanting to lose changes to the passed
hashes, one also wants to lose any changes resulting from manipulation
of the data in the hashes, but that wasn’t requested.

Are you seriously claiming that the person’s statements seemed to you to
indicate a desire for the data to change in the original data structure?

No, I’m claiming that the original requester’s statement didn’t
indicate a need to protect against changes resulting from calling
mutating methods on the keys or values of the hashes, only protection
from changes resulting from mutations on the hash.

That could be because changes to things held in the hash were supposed
to be propagated, or it (perhaps more likely) could be because the
code in which it was to be used wasn’t going to be calling mutating
methods on the keys or values from the hash in any case.

In the former case, a deep copy would be wrong, in the latter case it
would merely be unnecessary.

2 solutions have been presented. I think the original poster’s question
was sufficiently vague to make both of the solutions valid.

So, original poster: what were your intentions? Did you need only the
hashes in the array preserved, or did you ALSO need the contents of the
hashes in the array preserved? If the former, then clone/dup is fine.
If the latter, then you need some kind of deep copy. The “simplest” deep
copy I can think of is the Marshal#dump/load one.

Saludos,
Doug

On Thu, May 05, 2011 at 06:21:00AM +0900, Christopher D. wrote:

indicate a desire for the data to change in the original data structure?

No, I’m claiming that the original requester’s statement didn’t
indicate a need to protect against changes resulting from calling
mutating methods on the keys or values of the hashes, only protection
from changes resulting from mutations on the hash.

He didn’t specify “only protection from changes resulting from mutations
on the hash.” He said he didn’t want his actions to change his data
structure, in a very general way. Drawing distinctions between the data
itself and the “containers” in which the data resides seems like a case
of inventing complexity in the original request that was not indicated.

That could be because changes to things held in the hash were supposed
to be propagated, or it (perhaps more likely) could be because the code
in which it was to be used wasn’t going to be calling mutating methods
on the keys or values from the hash in any case.

In the former case, a deep copy would be wrong, in the latter case it
would merely be unnecessary.

. . . and in the general “I don’t want to permanently change anything”
case, which seems most likely, it would be necessary.

On Wed, May 4, 2011 at 3:17 PM, Chad P. wrote:

Are you seriously claiming that the person’s statements seemed to you to
indicate a desire for the data to change in the original data structure?

No, I’m claiming that the original requester’s statement didn’t
indicate a need to protect against changes resulting from calling
mutating methods on the keys or values of the hashes, only protection
from changes resulting from mutations on the hash.

He didn’t specify “only protection from changes resulting from mutations
on the hash.”

Yes, which is why I used the word “indicated” rather than “specified”;
those words have very different meanings.

I certainly think that, as at least one commenter has stated, the
original request was sufficiently ambiguous to support different
readings. I think we’ve addressed the relative utility of the various
options that have been presented sufficiently that the OP (or others
with similar
concerns) can make up their minds about what approach is best for
their use cases, or ask cogent follow-up questions if they need more
information.

Clearly, we disagree on what the best interpretation of the original
poster’s requirements is, but I think we are clearly past the point
where further discussion of that disagreement has any value for anyone
reading.

On Thu, May 05, 2011 at 06:38:57AM +0900, Douglas S. wrote:

2 solutions have been presented. I think the original poster’s question was
sufficiently vague to make both of the solutions valid.

So, original poster: what were your intentions? Did you need only the
hashes in the array preserved, or did you ALSO need the contents of the
hashes in the array preserved? If the former, then clone/dup is fine. If
the latter, then you need some kind of deep copy. The “simplest” deep copy
I can think of is the Marshal#dump/load one.

Does anyone want to make a bet with me? I bet $10 he says he doesn’t
want the actual data in the hash to change, if he comes back and says
anything one way or the other. My first clue is where he starts out by
talking about copying the data – using the word “data” literally –
without simply creating a reference to the original data.

On Thu, May 05, 2011 at 07:42:58AM +0900, Christopher D. wrote:

Yes, which is why I used the word “indicated” rather than “specified”;
those words have very different meanings.

What he specified was “data”. Read it again if you like. It was “data”,
and not “programming abstractions”. I’m not aware of any definition of
“data” that means “not the actual data, but rather the programming
abstractions by which the data is managed”.

I eagerly await a response from the guy who posed the question.

On Thu, May 05, 2011 at 01:35:47AM +0900, Patrick L. wrote:

I’m running a pc with Windows Vista…
My email flags your emails with ‘phishing’ warning…
Thought you might want to know…

Thank you.

I’m trying to get this resolved with a service provider. Apparently the
provider’s mail server IP address has ended up on a blacklist or two for
some reason completely unrelated to me.

Just in case your phishing warning is unrelated to that, though – are
you sure it’s not related to the fact that my emails are digitally
signed? I’ve noticed that MS Windows users sometimes have problems with
digital signature attachments being marked as malware or otherwise
misidentified as some kind of threat.