More efficient date range comparison


#1

Hello all

I’m writing a calendar application that needs to check for conflicts
between recurring entries. Each Entry object has a recurrences() method
which returns an array of ranges - each range contains the start and end
times of each future occurence.

I need to check for conflicts between new and existing entries. I’m
doing this by checking that none of the future occurences of the new
entry clash with future occurences of existing entries:

def conflicts?(other)
conflicts = 0
recurrences.each do |my_rec|
other.recurrences.each do |other_rec|
start, finish = other_rec.first, other_rec.last
conflicts += 1 if my_rec.include?(start) ||
my_rec.include?(finish)
end
end
conflicts > 0
end

recurrences() defaults to returning all occurences between start time
and start time + 1 year

the problem is this method is not the most efficient in the world.
comparing just two entries, each with a daily recurrence over 1 year,
leads to 365 * 365 comparions (on my machine that takes 4+ seconds).
There may be any number of existing entries to compare a new entry to so
the method I have now is useless.

I don’t have a computer science or maths background but I’ve been
reading various textbooks on algorithms and I havn’t been able to find a
way to optimize the method. Does anyone else have any ideas?

thanks
Dave


#2

First of all, you could return true as soon as you find a conflict
instead of counting them. However, it does not change anything from a
theoretical point of view: your method executes in O(nm) time which
means that if recurrences has n entries and other.recurrences has m
entries, then the time spent in the method is a function of n
m in the
worst case: when no conflict arrises your method checks all pairs.

If I understand your need correctly (I’m not sure to understand what
an occurence is exactly), it seems that you could write an algorithm
that executes in O(m+n), which is really better. By sorting your
ranges, you could avoid many useless comparisons. It reminds me a
couple of algorithms that I’ve written a few month ago and which
computes the union/intersection of composite intervals. You can have a
look at code.chefbe.net: there’s a project called CRUC with a well
documented class named Interval (rdoc is online) which contains these
algorithms.

I hope this helps (let me know)
blambeau


#3

Dave W.worth wrote:

  conflicts += 1 if my_rec.include?(start) ||

my_rec.include?(finish)

Off topic: I think you have a bug here. If my_rec is, say, 20…40, and
other_rec is, say, 10…50, your test won’t detect this conflict (because
neither 10 nor 50 are included in the my_rec range).


#4

Oops, you are perfectly right … sorry!

blambeau


#5

It’s a OR, not a AND, so it seems correct … no? Seems to be the
classical overlap operator on simple intervals.

blambeau


#6

Albert S. wrote:

Dave W.worth wrote:

  conflicts += 1 if my_rec.include?(start) ||

my_rec.include?(finish)

Off topic: I think you have a bug here. If my_rec is, say, 20…40, and
other_rec is, say, 10…50, your test won’t detect this conflict (because
neither 10 nor 50 are included in the my_rec range).

good catch! thanks


#7

LAMBEAU Bernard wrote:

Extracted from the CRUC project I mentionned (see the intersection
method). I think the code below should make the job. Even if your
range lists are not initially sorted, this algorithm should be better
than the initial one: O(nlogn+mlogm+n+m) in the worst case.

Assume you have two lists of ranges, each one being sorted by start

point
my_ranges, other_ranges = …

Take the first two, as well as their extremities

r1, r2 = my_ranges.shift, other_ranges.shift
b1, e1 = r1.begin, r1.end
b2, e2 = r2.begin, r2.end

until (my_ranges.empty? or other_ranges.empty?)
if e1<b2
# my end is before its begin, we don’t overlap
r1 = my_points.shift
b1, e1 = r1.begin, r1.end
elsif
# its end is before my begin, we don’t overlap
r2 = other_points.shift
b2, e2 = r2.begin, r2.end
else
# intersection found, it’s a conflict
return true
end
end

All ranges are disjoint, no conflict

return false

blambeau

thanks for your suggestions. I’m using a slightly modified version of
the code you posted and it’s about 8x faster. I’m sure it could be
optimized further but it’s good enough for now. thanks again!

dave


#8

Extracted from the CRUC project I mentionned (see the intersection
method). I think the code below should make the job. Even if your
range lists are not initially sorted, this algorithm should be better
than the initial one: O(nlogn+mlogm+n+m) in the worst case.

Assume you have two lists of ranges, each one being sorted by start

point
my_ranges, other_ranges = …

Take the first two, as well as their extremities

r1, r2 = my_ranges.shift, other_ranges.shift
b1, e1 = r1.begin, r1.end
b2, e2 = r2.begin, r2.end

until (my_ranges.empty? or other_ranges.empty?)
if e1<b2
# my end is before its begin, we don’t overlap
r1 = my_points.shift
b1, e1 = r1.begin, r1.end
elsif
# its end is before my begin, we don’t overlap
r2 = other_points.shift
b2, e2 = r2.begin, r2.end
else
# intersection found, it’s a conflict
return true
end
end

All ranges are disjoint, no conflict

return false

blambeau