# A small problem for arrays

I have two arrays, ar_1 and ar_2.
How can I use these existing arrays to create an array ar_3 containing
all the elements that are in ar_1 but not in ar_2?

2010/8/21 Unc88 Unc88 [email protected]:

I have two arrays, ar_1 and ar_2.
How can I use these existing arrays to create an array ar_3 containing
all the elements that are in ar_1 but not in ar_2?

Just take the difference of the two arrays:

a1 = (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
a2 = (4..7).to_a
=> [4, 5, 6, 7]
a1 - a2
=> [1, 2, 3, 8, 9, 10]
a2 - a1
=> []

Cheers,

Can arrays be subtracted without using the “-” operator?
Is there a standard method for this in the Array class?

On Sun, Aug 22, 2010 at 01:00:54AM +0900, Unc88 Unc88 wrote:

Can arrays be subtracted without using the “-” operator?
Is there a standard method for this in the Array class?

The method is `-`, and using it as an infix operator is just syntactic
sugar. If you want it to look like standard .method syntax, do this:

``````
> ar_3 = ar_1.-(ar_2)
=> [1, 2, 3, 8, 9, 10]
``````

Of course, the parentheses are optional:

``````
> ar_3 = ar_1.- ar_2
=> [1, 2, 3, 8, 9, 10]
``````
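
For readers on newer Rubies, here is a small sketch of a few equivalent spellings. It reuses the `ar_1` and `ar_2` names from the question with the sample values from the earlier reply. `Array#difference` was only added in Ruby 2.6, so it did not exist when this thread was written, and the sketch guards the call for that reason.

```ruby
ar_1 = (1..10).to_a
ar_2 = (4..7).to_a

# Explicit method-call form of the operator; works on any Ruby version.
via_operator_method = ar_1.-(ar_2)

# Dynamic dispatch by method name; also works on any Ruby version.
via_send = ar_1.send(:-, ar_2)

# Array#difference exists only on Ruby 2.6 and later, so fall back to
# the operator on older interpreters.
via_difference =
  if ar_1.respond_to?(:difference)
    ar_1.difference(ar_2)
  else
    ar_1 - ar_2
  end

p via_operator_method
p via_send
p via_difference
```

All three forms return a new array and leave `ar_1` untouched.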

2010/8/21 Unc88 Unc88 [email protected]:

Can arrays be subtracted without using the “-” operator?
Is there a standard method for this in the Array class?

“-” is a standard method of class Array. The apparent “operator”
behavior is just syntactic sugar added on top of it:

a1 = (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
a2 = (5..7).to_a
=> [5, 6, 7]
a1.-(a2)
=> [1, 2, 3, 4, 8, 9, 10]

The last line calls the “-” method of object “a1” with object “a2” as
its argument.

~> ri Array.-

``````
---------------------------------------------------------------- Array#-
     array - other_array    → an_array

     Array Difference---Returns a new array that is a copy of the
     original array, removing any items that also appear in other_array.
     (If you need set-like behavior, see the library class Set.)

        [ 1, 1, 2, 2, 3, 3, 4, 5 ] - [ 1, 2, 4 ]  #=>  [ 3, 3, 5 ]
``````
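
To make the remark about set-like behavior concrete, here is a minimal sketch using the same values as the ri example. The variable names are only illustrative. It contrasts `Array#-`, which keeps repeated survivors, with a `Set` difference, which holds each value once.

```ruby
require 'set'

original = [1, 1, 2, 2, 3, 3, 4, 5]
other    = [1, 2, 4]

# Array#- removes every occurrence of the items found in `other`,
# but keeps duplicates of the items that survive.
array_diff = original - other

# Converting to Set first gives set semantics: each value at most once.
set_diff = original.to_set - other.to_set

# Staying with arrays, uniq gives the same values as the set version.
uniq_diff = (original - other).uniq

p array_diff
p set_diff.to_a
p uniq_diff
```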

Cheers,

On 21.08.2010 18:27, Jean-Julien F. wrote:

Just adding to that: if Arrays are large and / or there are frequent set
operations needed then using class Set might yield better performance.

Kind regards

robert

On Mon, Aug 30, 2010 at 5:45 PM, Ruby U. Ruby U. [email protected]
wrote:

– Just adding to that: if Arrays are large and / or there are frequent set
operations needed then using class Set might yield better performance.

I don’t quite understand what you mean. If it is not too much trouble,
could you give an example?

It means that there are some operations that are more efficient in Set
than in Array, and that if you need a lot of those, it would be better
to use Set instead. For example, the intersection of two Sets is
faster than the intersection of two Arrays:

require 'benchmark'
require 'set'

n = 1_000

a1 = (1..10_000).map {|x| rand(100)}
a2 = (1..10_000).map {|x| rand(100)}
s1 = Set.new.merge a1
s2 = Set.new.merge a2

Benchmark.bmbm do |x|
  x.report("array minus") do
    n.times {a1 - a2}
  end
  x.report("set &") do
    n.times {s1 & s2}
  end
end

``````
$ ruby set_bm.rb
Rehearsal -----------------------------------------------
array minus   0.900000   0.000000   0.900000 (  0.935476)
set &         0.280000   0.070000   0.350000 (  0.361684)
-------------------------------------- total: 1.250000sec

                  user     system      total        real
array minus   0.880000   0.010000   0.890000 (  0.890552)
set &         0.280000   0.070000   0.350000 (  0.353687)
``````

Jesus.

2010/8/30 Jesús Gabriel y Galán [email protected]:

a1 = (1..10_000).map {|x| rand(100)}
a2 = (1..10_000).map {|x| rand(100)}
s1 = Set.new.merge a1
s2 = Set.new.merge a2

Here’s another (probably more efficient) way to write that:

a1 = Array.new(10_000) { rand(100) }
a2 = Array.new(10_000) { rand(100) }

s1 = a1.to_set
s2 = a2.to_set

It would probably be better to apply #uniq! on those Arrays (or do “a1
= s2.to_a” after set creation) to get collections with identical
sizes.
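
A quick sketch of that suggestion, under the assumption that the goal is simply for the array and the set to hold the same number of elements. The name `unique_a1` is hypothetical. It uses the non-destructive `uniq` because `uniq!` returns `nil` when nothing was removed, which is easy to trip over when chaining.

```ruby
require 'set'

# Sample data built the same way as in the post above.
a1 = Array.new(10_000) { rand(100) }
s1 = a1.to_set

# uniq always returns an array, whereas uniq! returns nil if the
# array was already free of duplicates.
unique_a1 = a1.uniq

puts "array: #{a1.size}, unique array: #{unique_a1.size}, set: #{s1.size}"
```

With only 100 possible values, the original array has 10,000 elements but the set has at most 100, so a benchmark on `a1` versus `s1` compares collections of very different sizes.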

I’m sorry, but you are comparing apples and oranges here:

irb(main):001:0> a=[1,2,3]; b=[2,3,4]
=> [2, 3, 4]
irb(main):002:0> a & b
=> [2, 3]
irb(main):003:0> a.to_set & b.to_set
=> #<Set: {2, 3}>
irb(main):004:0> a - b
=> [1]
irb(main):005:0> a.to_set - b.to_set
=> #<Set: {1}>

The operators - and & do not do the same thing. But each of them behaves
identically for Array and Set!

Kind regards

robert

Robert K. wrote:

– Just adding to that: if Arrays are large and / or there are frequent set
operations needed then using class Set might yield better performance.

I don’t quite understand what you mean. If it is not too much trouble,
could you give an example?

2010/8/31 Jesús Gabriel y Galán [email protected]:

array minus 1.410000 0.010000 1.420000 ( 1.428664)
set minus 10.990000 3.070000 14.060000 ( 14.188415)

Could it be because Array is written in C, while Set is in Ruby
iterating over an Enumerable object?

Probably. It could also be that your collections were not large enough
to show the advantage of Set, or that the value distribution is
unfavorable for Set.

Please keep in mind that performance is not the only advantage of
using Set: it also gives you the semantics of having each value only
once in the collection, and it helps document requirements.
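
As a small illustration of those semantics (the `tags` collection is a made-up example, not something from this thread), a Set silently ignores repeated values, and `add?` reports whether a value was actually new:

```ruby
require 'set'

tags = Set.new
tags << "ruby" << "array" << "ruby"   # the second "ruby" is ignored
size_after_duplicates = tags.size

# add? returns the set itself when the value was new, nil otherwise.
added_new = tags.add?("set")
added_dup = tags.add?("ruby")

puts "size after duplicates: #{size_after_duplicates}"
puts "\"set\" was new: #{!added_new.nil?}"
puts "\"ruby\" was new: #{!added_dup.nil?}"
```

Declaring a collection as a Set tells the reader up front that duplicates are not expected, which an Array cannot express.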

Did I do something wrong again?

Not as far as I can see. I changed it a bit to look at different sizes:

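``````
require 'benchmark'
require 'set'

n = 100

# 1000 doubled repeatedly while below 100_000
sizes = [1_000, 2_000, 4_000, 8_000, 16_000, 32_000, 64_000]

Benchmark.bmbm 30 do |x|
  # bmbm runs the report blocks only after this outer block returns.
  # Iterating with each (rather than while) gives every size its own
  # a1/a2/s1/s2, so each report measures the data for its own size.
  sizes.each do |size|
    a1 = (1..size).sort_by { rand }
    a2 = ((size / 2)..(size / 2 + size)).sort_by { rand }
    s1 = a1.to_set
    s2 = a2.to_set

    x.report("array minus #{size}") do
      n.times { a1 - a2 }
    end

    x.report("set minus #{size}") do
      n.times { s1 - s2 }
    end

    x.report("array & #{size}") do
      n.times { a1 & a2 }
    end

    x.report("set & #{size}") do
      n.times { s1 & s2 }
    end
  end
end
``````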

Kind regards

robert

On Tue, Aug 31, 2010 at 2:34 PM, Robert K.
[email protected] wrote:

It would probably be better to apply #uniq! on those Arrays (or do “a1
= s2.to_a” after set creation) to get collections with identical
sizes.

Yes, in hindsight it would have been better to build two arrays, for
example (1..5000).to_a and (3000..8000).to_a, and randomize them.

Operators - and & do not do the same thing. But they behave identical
for Array and Set!

I totally brainfarted! Here is the revised version. The results are
surprising, at least to me: Set#- is less efficient than Array#- (unless
I’m doing something wrong again):

require 'benchmark'
require 'set'

n = 1_000

a1 = (1..5_000).sort_by { rand }
a2 = (3_000..8_000).sort_by { rand }
s1 = a1.to_set
s2 = a2.to_set

Benchmark.bmbm do |x|
  x.report("array minus") do
    n.times {a1 - a2}
  end
  x.report("set minus") do
    n.times {s1 - s2}
  end
end

``````
$ ruby set_bm.rb
Rehearsal -----------------------------------------------
array minus   1.370000   0.010000   1.380000 (  1.398643)
set minus    10.880000   3.060000  13.940000 ( 14.100127)
------------------------------------- total: 15.320000sec

                  user     system      total        real
array minus   1.410000   0.010000   1.420000 (  1.428664)
set minus    10.990000   3.070000  14.060000 ( 14.188415)
``````

Could it be because Array is written in C, while Set is in Ruby
iterating over an Enumerable object? Did I do something wrong again?
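
To illustrate the kind of work that hypothesis refers to, here is a rough sketch of a hash-backed difference written entirely in Ruby. It is not the actual source of set.rb, and the method name `hash_backed_difference` is invented for this example. It only shows that a pure-Ruby version calls a block once per element on both sides, whereas a version implemented in C can do the whole loop inside the interpreter.

```ruby
# Illustrative only: a pure-Ruby, hash-backed difference with set
# semantics (each value appears at most once in the result).
def hash_backed_difference(left, right)
  remaining = {}
  # One Ruby-level block call per element of the left collection...
  left.each { |item| remaining[item] = true }
  # ...and one more per element of the right collection.
  right.each { |item| remaining.delete(item) }
  remaining.keys
end

p hash_backed_difference([1, 2, 3, 4], [2, 4])
```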

Jesus.