SUMMARY

What’s the fastest and/or shortest you can turn this:

in = [“i”, “v”, “w”, “e”, “l”, “d”, “u”, “f”, “e”, “v”, “f”, “e”,

“d”, “e”, “w”, “d”]

into this:

[“i”, “l”, “u”, “v”, “f”, “e”, “w”, “d”]

?

DETAILS

The goal is, given a stream of items, to produce the same ordered

stream with no duplicates, where the relative position of the last item

wins. The shortest (and fastest?) for me turns out to be simply:

in.reverse.uniq.reverse

I have the question mark because the in-place version is sometimes

faster, sometimes slower in my tests:

out = in.dup; out.reverse!; out.uniq!; out.reverse!

I thought it might be faster to loop through the array in one pass

myself, but that turns out to be about 4x slower:

seen = {}

out = in.dup

out.each_with_index{ |v,i|

if n=h[v]

out[n] = nil

end

h[v] = i

}

out.compact!

The input for me isn’t REALLY an array, but rather a series of items

that I receive one at a time from a depth-first traversal of a graph.

If this influences your answer, so be it

BACKGROUND

Why is this problem mildly interesting? See

http://phrogz.net/nodes/traversingdirectedgraph.asp

I was reading that page for nostalgia the other day and thought about

porting the calculation library to Ruby for fun, and to see how much

easier solving the problem would be in Ruby.

Here’s the code I put together for fun (the cells and formulae in the

below correspond to the graphic on the above url), before adding the

‘minimal update chain’ optimization.

require ‘set’

module ClassContainer

def initialize( *args )

self.class[ self.name ] = self

super

end

def self.included( klass )

klass.class_eval{

def self.inherited( subklass )

( @subclasses ||= [] ) << subklass

end

```
def self.all
@subclasses ||= []
@instances ||= {}
all = @instances.dup
@subclasses.each{ |sub| all.merge!( sub.all ) }
all
end
def self.each
all.each_value{ |i| yield i }
end
def self.[]( name )
all[ name ]
end
def self.[]=( name, instance )
( @instances ||= {} )[ name ] = instance
end
def self.sorted
all.values.sort_by{ |n| n.name.to_s }
end
}
```

end

end

module GraphNode

attr_reader :outgoing_edges, :incoming_edges

def initialize( *args )

@outgoing_edges = Set.new

@incoming_edges = Set.new

end

def edges

@outgoing_edges + @incoming_edges

end

def add_edge_to( other_node )

@outgoing_edges << other_node

other_node.incoming_edges << self

self

end

def clear_incoming

@incoming_edges.each{ |n|

n.outgoing_edges.delete self

}

@incoming_edges = Set.new

self

end

def traverse( seen_nodes = Set.new, &block )

new_seen = seen_nodes.dup << self

@outgoing_edges.each{ |destination|

unless seen_nodes.include?( destination )

yield destination

destination.traverse( new_seen, &block )

end

}

end

end

class Cell

include GraphNode

include ClassContainer

attr_reader :name

attr_accessor :value

def initialize( name, value=0 )

raise “Duplicate ID” if Cell[ name ]

@name = name

@value = value * 1.0

```
super
```

end

def value=( new_value )

# Naive update approach

return if @value == new_value

@value = new_value

update_dependents

new_value

end

def update_dependents

traverse{ |formula|

formula.evaluate( true )

}

end

def to_s

‘<%s “%s” value=%.2f>’ % [ self.class.name, @name, @value ]

end

end

class Formula < Cell

def initialize( *args )

super

@formula, @value = @value.to_s, 0.0

update_dependencies

end

def update_dependencies

clear_incoming

@formula.scan( /@(\w+)/ ).flatten.each{ |source_name|

if incoming = Cell[ source_name.to_sym ]

incoming.add_edge_to( self )

end

}

end

def evaluate( skip_dependents = false )

scope = ValueSpace.new

Cell.each{ |n| scope.set( n.name, n.value ) }

@value = eval( @formula, scope.get_binding )

puts “Updated #{self}”

update_dependents unless skip_dependents

end

def to_s

‘<%s “%s” formula="%s" value=%.2f>’ % [ self.class.name, @name,

@formula, @value ]

end

end

class ValueSpace

def set( name, value )

instance_variable_set :"@#{name}", value

end

def get_binding

binding

end

end

if $0 == **FILE**

Cell.new( :a, 5 )

Cell.new( :d, 3 )

Cell.new( :e, 32 )

Cell.new( :i, 10 )

Cell.new( :l, 12 )

Formula.new( :b, “@d + @e + @a” )

Formula.new( :g, “@h * @i” ).value = 20

Formula.new( :h, “@g / @i” ).value = 2

Formula.new( :c, “@a + 3” )

Formula.new( :f, “@b + @c” )

Formula.new( :j, “@c + 8” )

Formula.new( :k, “@l + @j” )

Formula.new( :m, “@l + 10” )

# Need to call here to handle circular dependencies

Formula.each{ |f| f.update_dependencies }

# Naive approach…

puts “Naive initial evaluation…”

Formula.each{ |f| f.evaluate }

puts “*” * 40

puts “Initial values…”

puts Cell.sorted

puts “*” * 40

puts “Changing ‘a’ to 7…”

Cell[ :a ].value = 7

puts “*” * 40

puts “Changing ‘i’ to 36…”

Cell[ :i ].value = 36

puts ‘-’ * 40

puts “Per cell dependency…”

Cell.each{ |cell|

puts “#{cell.name} -> #{cell.outgoing_edges.map{|c| c.name}.join(’,

')}”

}

end

(which outputs)

Naive initial evaluation…

Updated <Formula “c” formula="@a + 3" value=8.00>

Updated <Formula “f” formula="@b + @c" value=8.00>

Updated <Formula “j” formula="@c + 8" value=16.00>

Updated <Formula “k” formula="@l + @j" value=28.00>

Updated <Formula “f” formula="@b + @c" value=8.00>

Updated <Formula “j” formula="@c + 8" value=16.00>

Updated <Formula “k” formula="@l + @j" value=28.00>

Updated <Formula “b” formula="@d + @e + @a" value=40.00>

Updated <Formula “f” formula="@b + @c" value=48.00>

Updated <Formula “k” formula="@l + @j" value=28.00>

Updated <Formula “g” formula="@h * @i" value=20.00>

Updated <Formula “h” formula="@g / @i" value=2.00>

Updated <Formula “m” formula="@l + 10" value=22.00>

Updated <Formula “h” formula="@g / @i" value=2.00>

Updated <Formula “g” formula="@h * @i" value=20.00>

Initial values…

<Cell “a” value=5.00>

<Formula “b” formula="@d + @e + @a" value=40.00>

<Formula “c” formula="@a + 3" value=8.00>

<Cell “d” value=3.00>

<Cell “e” value=32.00>

<Formula “f” formula="@b + @c" value=48.00>

<Formula “g” formula="@h * @i" value=20.00>

<Formula “h” formula="@g / @i" value=2.00>

<Cell “i” value=10.00>

<Formula “j” formula="@c + 8" value=16.00>

<Formula “k” formula="@l + @j" value=28.00>

<Cell “l” value=12.00>

<Formula “m” formula="@l + 10" value=22.00>

Changing ‘a’ to 7…

Updated <Formula “b” formula="@d + @e + @a" value=42.00>

Updated <Formula “f” formula="@b + @c" value=50.00>

Updated <Formula “c” formula="@a + 3" value=10.00>

Updated <Formula “f” formula="@b + @c" value=52.00>

Updated <Formula “j” formula="@c + 8" value=18.00>

Updated <Formula “k” formula="@l + @j" value=30.00>

## Changing ‘i’ to 36…

Updated <Formula “h” formula="@g / @i" value=0.56>

Updated <Formula “g” formula="@h * @i" value=20.00>

Updated <Formula “g” formula="@h * @i" value=20.00>

Updated <Formula “h” formula="@g / @i" value=0.56>

Per cell dependency…

c -> f, j

e -> b

f ->

i -> h, g

l -> k, m

j -> k

b -> f

k ->

g -> h

m ->

a -> b, c

h -> g

d -> b