Removing Duplicate Objects from Object List

Greetings all.

Does anyone have a good idea of how to write a loop that checks if two
objects are equal? By “equal” here I refer to the ‘eql?’ method, to test if
the objects have the same value.

I have a set of Rule objects that will be stored in a RuleList object. I know
how to cycle through the RuleList. I’m just doing this:

$ruleList.selection.each { |rule|

}

The problem is that I need to go through each rule and check if it is equal
to any of the other rules that are in the list. If a duplicate is found,
one of the duplicate rules should be removed.

Every solution I’ve tried has ended up either removing objects incorrectly
or not finding the duplicates in the first place.

Here’s an example of some Rule objects:

<Tendent::Rule:0x2d5f7a8 @filter=“first after 202G_OrdAdd”,
@value=“203G_OrdUpdateFirst”, @point=“203G_OrdUpdateFirst”>
<Tendent::Rule:0x2d5f71c @filter=“last”, @value=“203G_OrdUpdateLast”,
@point=“203G_OrdUpdateLast”>
<Tendent::Rule:0x2d5f6a4 @filter=“first after 202G_OrdAdd”,
@value=“203G_OrdUpdateFirst”, @point=“203G_OrdUpdateFirst”>
<Tendent::Rule:0x2d5f62c @filter=“last”, @value=“203G_OrdUpdateLast”,
@point=“203G_OrdUpdateLast”>

Here is what they look like as strings:

203G_OrdUpdateFirst, first after 202G_OrdAdd, 203G_OrdUpdateFirst
203G_OrdUpdateLast, last, 203G_OrdUpdateLast
203G_OrdUpdateFirst, first after 202G_OrdAdd, 203G_OrdUpdateFirst
203G_OrdUpdateLast, last, 203G_OrdUpdateLast

So as you can see, I have four rules, but actually only two are unique.
(That just happens to be the case here. In other cases, perhaps there will
be six rules, and two will be unique.)

Can anyone see an efficient way to do this?

Is it better to just convert these into an array? I know the Array class has
the ‘uniq’ method. The problem is that I would still need the rules to then
be objects as well. In other words, even if I put all the objects in an
array and modify the array, I would need to reflect the changes in the
object list itself, such that the duplicate objects no longer exist.

  • Jeff

On Oct 9, 12:16 pm, “Jeff Nyman” <[email protected]_gmail.com>
wrote:



Jeff,

How are you storing the Rules in your RuleList at the moment? Personally
I’d use an Array (or simply subclass Array) and then you get to use
Array#uniq without shifting objects back and forth.

Stephen

“gaspode” [email protected] wrote:

> How are you storing the Rules in your RuleList at the moment? Personally
> I’d use an Array (or simply subclass Array) and then you get to use
> Array#uniq without shifting objects back and forth.

Essentially, I have a RuleList class like this:

class RuleList
  def initialize
    @rules = Array.new
  end

  def append(this_rule)
    @rules.push(this_rule)
  end

  def selection
    @rules.find_all { |rule| rule }
  end
end

Then I have a Rule class like this:

class Rule
  attr_accessor :point, :filter, :value

  def initialize(point, filter, value)
    @point = point
    @filter = filter
    @value = value
  end

  def to_s
    "#{@point}, #{@filter}, #{@value}"
  end
end

When a rule object needs to be added to the list, I do this:

$ruleList.append(Rule.new(step.point2, rule, value))

Does that give enough detail?

In playing around a bit more, I tried this:

rules_array = $ruleList.selection.collect { |rule| rule }

Then I tried:

rules_array.uniq!

The problem is that this finds nothing as a duplicate. But that makes sense
(I think) because the object ID is probably being considered as part of the
test and those will, of course, not be duplicates.

It sounds like you’re saying it would be better to not use a Rule class in
the first place. Is that accurate?

  • Jeff
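The observation above can be reproduced in a few lines. This is only a minimal sketch: PlainRule is a hypothetical stand-in for the Rule class in this post, with no equality methods of its own, so it falls back on the identity-based defaults from Object.

```ruby
# PlainRule is a hypothetical stand-in for Rule with no custom equality.
class PlainRule
  attr_reader :point, :filter, :value

  def initialize(point, filter, value)
    @point, @filter, @value = point, filter, value
  end
end

a = PlainRule.new("203G_OrdUpdateLast", "last", "203G_OrdUpdateLast")
b = PlainRule.new("203G_OrdUpdateLast", "last", "203G_OrdUpdateLast")

puts a.eql?(b)          # false: Object#eql? compares identity by default
puts a.eql?(a)          # true: an object is always eql? to itself
puts([a, b].uniq.size)  # 2: uniq treats the two instances as distinct
```

So uniq! is not broken here; it simply has no way to know that two separate instances are meant to count as the same rule.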

On Oct 9, 2006, at 1:50 PM, Jeff Nyman wrote:

> The problem is that I need to go through each rule and check if it
> is equal to any of the other rules that are in the list. If a duplicate
> is found, one of the duplicate rules should be removed.

You should implement #eql? and #hash methods on your class, and store
all instances in a [Set](http://ruby-doc.org/stdlib/libdoc/set/rdoc/classes/Set.html).

require "set"

class Rule
  attr_accessor :point, :filter, :value

  def initialize(point, filter, value)
    @point = point
    @filter = filter
    @value = value
  end

  # Rules are equal when all three attributes are equal.
  def eql?(rule)
    rule.point.eql?(@point) &&
      rule.filter.eql?(@filter) &&
      rule.value.eql?(@value)
  end

  # Equal rules must produce equal hash values.
  def hash
    @point.hash + @filter.hash + @value.hash
  end
end

rules_set = Set.new
rules_set << Rule.new(1, 1, 1)
rules_set << Rule.new(1, 1, 1) # duplicate rule
rules_set << Rule.new(1, 1, 2)
rules_set.size # => 2
rules_set # => #<Set: {#<Rule:0x89d78 @value=2, @filter=1, @point=1>,
          #    #<Rule:0x89db4 @value=1, @filter=1, @point=1>}>

– Daniel
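The same pair of methods should also make Array#uniq work, which keeps the rules in an ordinary array. The sketch below is a variation on the class above, not the code from this post: combining the attributes through an array's hash and the is_a? guard are assumptions added here for illustration.

```ruby
class Rule
  attr_reader :point, :filter, :value

  def initialize(point, filter, value)
    @point, @filter, @value = point, filter, value
  end

  # Two rules are equal when all three attributes are equal.
  def eql?(other)
    other.is_a?(Rule) &&
      [point, filter, value].eql?([other.point, other.filter, other.value])
  end

  # Equal rules must return equal hash values; Array#hash combines the parts.
  def hash
    [point, filter, value].hash
  end
end

rules = [
  Rule.new("203G_OrdUpdateFirst", "first after 202G_OrdAdd", "203G_OrdUpdateFirst"),
  Rule.new("203G_OrdUpdateLast", "last", "203G_OrdUpdateLast"),
  Rule.new("203G_OrdUpdateFirst", "first after 202G_OrdAdd", "203G_OrdUpdateFirst"),
  Rule.new("203G_OrdUpdateLast", "last", "203G_OrdUpdateLast")
]

unique = rules.uniq
puts unique.size   # 2
puts unique.map { |rule| rule.point }
```

Array#uniq keeps the first occurrence of each element, so the surviving rules stay in their original order.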

Daniel N wrote:

> If instead of declaring your @rules as an array, you declare it as a set
> you will get no duplicates for free (I think)

Yes, but you will lose order (if that is important).

> However, you need to incorporate the <=> operator in your Rule class to
> tell ruby how your objects relate to each other.
> ie are they <, >, or =

That won’t do the trick for uniq, as uniq uses a hash internally to
find duplicates. You have to define #hash and #eql? for this to work.
(or was it #hash and #== ?)

cheers

Simon
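A small sketch of the point about <=>, under the assumption that the class defines only an ordering. OrderedRule is a hypothetical name used for illustration; it mirrors the attributes of the Rule class in this thread.

```ruby
# OrderedRule is hypothetical: it defines ordering via <=> and nothing else.
class OrderedRule
  include Comparable
  attr_reader :point, :filter, :value

  def initialize(point, filter, value)
    @point, @filter, @value = point, filter, value
  end

  # Ordering only: compare the attribute lists element by element.
  def <=>(other)
    [point, filter, value] <=> [other.point, other.filter, other.value]
  end
end

a = OrderedRule.new("203G_OrdUpdateLast", "last", "203G_OrdUpdateLast")
b = OrderedRule.new("203G_OrdUpdateLast", "last", "203G_OrdUpdateLast")

puts a == b             # true: Comparable builds == on top of <=>
puts([a, b].uniq.size)  # 2: uniq consults hash and eql?, not <=> or ==
```

Even though the two objects compare as equal with ==, uniq still keeps both, because the default hash and eql? are untouched.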

On Oct 9, 12:45 pm, “Jeff Nyman” <[email protected]_gmail.com>
wrote:

> When a rule object needs to be added to the list, I do this:
>
> $ruleList.append(Rule.new(step.point2, rule, value))
>
> Does that give enough detail?

Plenty

> The problem is that this finds nothing as a duplicate. But that makes
> sense (I think) because the object ID is probably being considered as
> part of the test and those will, of course, not be duplicates.

The reason that it isn’t working as you expect is that the uniq method
compares elements using their hash and eql? methods (I think, somebody
correct me if I’m full of it). If you implement both in your Rule class
(so that identical Rules return the same hash value and are eql? to each
other), this should work fine.

> It sounds like you’re saying it would be better to not use a Rule class in
> the first place. Is that accurate?

No, your current Rule class is good. Just implement hash and eql? on it!

Rather than doing:

rules_array = $ruleList.selection.collect { |rule| rule }
rules_array.uniq!

you could add a uniq and uniq! method to your RuleList that just
delegates the work to the underlying Array

def uniq
  @rules.uniq
end

def uniq!
  @rules.uniq!
end

If it is the case that you NEVER want the same Rule in there twice,
just do the check in the append method. Note that include? compares
elements with ==, so this needs an == method on Rule rather than hash:

def append(this_rule)
  @rules.push(this_rule) unless @rules.include?(this_rule)
end
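Putting the append check together might look like the sketch below. It reuses the Rule and RuleList names from this thread, but the == definition, the alias to eql?, and the use of dup in selection are assumptions made for the example rather than code from any post here.

```ruby
class Rule
  attr_reader :point, :filter, :value

  def initialize(point, filter, value)
    @point, @filter, @value = point, filter, value
  end

  # Array#include? calls ==, so value equality has to be defined here.
  def ==(other)
    other.is_a?(Rule) &&
      point == other.point &&
      filter == other.filter &&
      value == other.value
  end

  # Keep eql? and hash consistent with == so uniq and Set agree.
  alias_method :eql?, :==

  def hash
    [point, filter, value].hash
  end
end

class RuleList
  def initialize
    @rules = []
  end

  # Skip the rule if an equal one is already stored.
  def append(this_rule)
    @rules.push(this_rule) unless @rules.include?(this_rule)
  end

  # Return a copy so callers cannot modify the internal array.
  def selection
    @rules.dup
  end
end

list = RuleList.new
list.append(Rule.new("203G_OrdUpdateLast", "last", "203G_OrdUpdateLast"))
list.append(Rule.new("203G_OrdUpdateLast", "last", "203G_OrdUpdateLast"))
puts list.selection.size   # 1
```

The include? check scans the whole array on every append, which is fine for a handful of rules; for large lists a Set is likely the better fit.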

Thank you to all of you!

With everything said here, I definitely have this working now. Not only
that, but I learned a lot more about hash and Set.

(Just when you think you have a grasp of Ruby, you find you were only at the
tip of the iceberg …)

  • Jeff

On 10/9/06, Jeff Nyman <[email protected]_gmail.com> wrote:


If instead of declaring your @rules as an array, you declare it as a set,
you will get no duplicates for free (I think).

However, you need to incorporate the <=> operator in your Rule class to
tell Ruby how your objects relate to each other,
i.e. are they <, >, or =.

uniq is failing because, even though the attributes of each instance of
the rule are equal to those of the other, the compared instances are
different objects.

class Foo
  def initialize(a,b)
    @a = a
    @b = b
  end
end

x = [Foo.new(:a, :b), Foo.new(:c, :d), Foo.new(:a, :b)]
p x
p x.uniq

ruby tst.rb
[#<Foo:0x3b6128 @a=:a, @b=:b>, #<Foo:0x3b6114 @a=:c, @b=:d>,
#<Foo:0x3b6100 @a=:a, @b=:b>]
[#<Foo:0x3b6128 @a=:a, @b=:b>, #<Foo:0x3b6114 @a=:c, @b=:d>,
#<Foo:0x3b6100 @a=:a, @b=:b>]

I tried defining eq? and hash and uniq still fails. hash returns
identical values for objects with identical content and eq? returns
true in this case, but uniq does not remove them.

Probably the right thing to do is to write a couple of loops.

On Wed, 11 Oct 2006, Mike wrote:

> I tried defining eq? and hash and uniq still fails. hash returns
> identical values for objects with identical content and eq? returns
> true in this case, but uniq does not remove them.

harp:~ > cat a.rb
class Foo
ATTRIBUTES = %w( a b )
ATTRIBUTES.each{|at| attr at}

def initialize(a,b) @a, @b = a, b end
def parts() ATTRIBUTES.map{|at| send at} end
def eql?(other) parts == other.parts end
def hash() parts.hash end
end

p [ Foo.new(:a, :b), Foo.new(:c, :d), Foo.new(:a, :b) ]
p [ Foo.new(:a, :b), Foo.new(:c, :d), Foo.new(:a, :b) ].uniq

harp:~ > ruby a.rb
[#<Foo:0xb75d137c @a=:a, @b=:b>, #<Foo:0xb75d1368 @a=:c, @b=:d>,
#<Foo:0xb75d1340 @a=:a, @b=:b>]
[#<Foo:0xb75d1214 @a=:a, @b=:b>, #<Foo:0xb75d1200 @a=:c, @b=:d>]

-a
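One plausible reading of the earlier failure, assuming the method really was named eq? as written, is that uniq never calls a method by that name. The sketch below contrasts the two spellings; FooEq and FooEql are hypothetical class names used only for this comparison.

```ruby
# FooEq defines hash plus a method named eq?, which Ruby never consults.
class FooEq
  attr_reader :a, :b

  def initialize(a, b)
    @a, @b = a, b
  end

  def eq?(other)
    [a, b] == [other.a, other.b]
  end

  def hash
    [a, b].hash
  end
end

# FooEql adds the name uniq actually looks for.
class FooEql < FooEq
  def eql?(other)
    eq?(other)
  end
end

puts([FooEq.new(:a, :b), FooEq.new(:a, :b)].uniq.size)    # 2
puts([FooEql.new(:a, :b), FooEql.new(:a, :b)].uniq.size)  # 1
```

In the first case the hash values match, but the default eql? still compares identity, so both elements survive.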
