Forum: Ruby more efficient date range comparison

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- und Ruby-related community platforms.
Dave W. (Guest)
on 2009-04-12 16:11
Hello all

I'm writing a calendar application that needs to check for conflicts
between recurring entries. Each Entry object has a recurrences() method
which returns an array of ranges - each range contains the start and end
times of each future occurence.

I need to check for conflicts between new and existing entries. I'm
doing this by checking that none of the future occurences of the new
entry clash with future occurences of existing entries:

def conflicts?(other)
  conflicts = 0
  recurrences.each do |my_rec|
    other.recurrences.each do |other_rec|
      start, finish = other_rec.first, other_rec.last
      conflicts += 1 if my_rec.include?(start) ||
my_rec.include?(finish)
    end
  end
  conflicts > 0
end

recurrences() defaults to returning all occurences between start time
and start time + 1 year

the problem is this method is not the most efficient in the world.
comparing just two entries, each with a daily recurrence over 1 year,
leads to 365 * 365 comparions (on my machine that takes 4+ seconds).
There may be any number of existing entries to compare a new entry to so
the method I have now is useless.

I don't have a computer science or maths background but I've been
reading various textbooks on algorithms and I havn't been able to find a
way to optimize the method. Does anyone else have any ideas?

thanks
Dave
LAMBEAU Bernard (Guest)
on 2009-04-12 21:45
(Received via mailing list)
First of all, you could return true as soon as you find a conflict
instead of counting them. However, it does not change anything from a
theoretical point of view: your method executes in O(n*m) time which
means that if recurrences has n entries and other.recurrences has m
entries, then the time spent in the method is a function of n*m in the
worst case: when no conflict arrises your method checks all pairs.

If I understand your need correctly (I'm not sure to understand what
an occurence is exactly), it seems that you could write an algorithm
that executes in O(m+n), which is really better. By sorting your
ranges, you could avoid many useless comparisons. It reminds me a
couple of algorithms that I've written a few month ago and which
computes the union/intersection of composite intervals. You can have a
look at code.chefbe.net: there's a project called CRUC with a well
documented class named Interval (rdoc is online) which contains these
algorithms.

I hope this helps (let me know)
blambeau
Albert S. (Guest)
on 2009-04-13 10:57
Dave W.worth wrote:
>       conflicts += 1 if my_rec.include?(start) ||
> my_rec.include?(finish)

Off topic: I think you have a bug here. If my_rec is, say, 20..40, and
other_rec is, say, 10..50, your test won't detect this conflict (because
neither 10 nor 50 are included in the my_rec range).
LAMBEAU Bernard (Guest)
on 2009-04-13 11:17
(Received via mailing list)
It's a OR, not a AND, so it seems correct ... no? Seems to be the
classical overlap operator on simple intervals.

blambeau
LAMBEAU Bernard (Guest)
on 2009-04-13 11:22
(Received via mailing list)
Oops, you are perfectly right ... sorry!

blambeau
LAMBEAU Bernard (Guest)
on 2009-04-13 11:37
(Received via mailing list)
Extracted from the CRUC project I mentionned (see the intersection
method). I think the code below should make the job. Even if your
range lists are not initially sorted, this algorithm should be better
than the initial one: O(nlogn+mlogm+n+m) in the worst case.

# Assume you have two lists of ranges, each one being sorted by start
point
my_ranges, other_ranges = ...

# Take the first two, as well as their extremities
r1, r2 = my_ranges.shift, other_ranges.shift
b1, e1 = r1.begin, r1.end
b2, e2 = r2.begin, r2.end

until (my_ranges.empty? or other_ranges.empty?)
  if e1<b2
    # my end is before its begin, we don't overlap
    r1 = my_points.shift
    b1, e1 = r1.begin, r1.end
  elsif
    # its end is before my begin, we don't overlap
    r2 = other_points.shift
    b2, e2 = r2.begin, r2.end
  else
    # intersection found, it's a conflict
    return true
  end
end

# All ranges are disjoint, no conflict
return false



blambeau
Dave W. (Guest)
on 2009-04-16 15:51
Albert S. wrote:
> Dave W.worth wrote:
>>       conflicts += 1 if my_rec.include?(start) ||
>> my_rec.include?(finish)
>
> Off topic: I think you have a bug here. If my_rec is, say, 20..40, and
> other_rec is, say, 10..50, your test won't detect this conflict (because
> neither 10 nor 50 are included in the my_rec range).

good catch! thanks
Dave W. (Guest)
on 2009-04-16 15:54
LAMBEAU Bernard wrote:
> Extracted from the CRUC project I mentionned (see the intersection
> method). I think the code below should make the job. Even if your
> range lists are not initially sorted, this algorithm should be better
> than the initial one: O(nlogn+mlogm+n+m) in the worst case.
>
> # Assume you have two lists of ranges, each one being sorted by start
> point
> my_ranges, other_ranges = ...
>
> # Take the first two, as well as their extremities
> r1, r2 = my_ranges.shift, other_ranges.shift
> b1, e1 = r1.begin, r1.end
> b2, e2 = r2.begin, r2.end
>
> until (my_ranges.empty? or other_ranges.empty?)
>   if e1<b2
>     # my end is before its begin, we don't overlap
>     r1 = my_points.shift
>     b1, e1 = r1.begin, r1.end
>   elsif
>     # its end is before my begin, we don't overlap
>     r2 = other_points.shift
>     b2, e2 = r2.begin, r2.end
>   else
>     # intersection found, it's a conflict
>     return true
>   end
> end
>
> # All ranges are disjoint, no conflict
> return false
>
>
>
> blambeau

thanks for your suggestions. I'm using a slightly modified version of
the code you posted and it's about 8x faster. I'm sure it could be
optimized further but it's good enough for now. thanks again!

dave
This topic is locked and can not be replied to.