Am 20.02.2006 um 11:16 schrieb David S.:
done and started to get thoroughly confused.
The first thing I see is that you are dealing with very complicated
data. In math-speak you are dealing with “bags” of intervals (a bag
being a “multiset” which can contain the same element more than
once). Intervals are complicated but that you might have multiple,
identical ones does not make things simpler.
Let’s only consider a single package. Your data looks like this when
abstracted
[2006-01-01, 2006-02-23]
[2006-02-12, 2006-03-23]
[2006-03-23, 2006-04-23]
[2006-01-13, 2006-06-23]
Let’s say these are four rooms or slots for this package in your
booking system.
Now, if you get a booking - say of the second interval, you might end
up with data like the following (assuming someone books from
2006-02-14 to 2006-03-04):
[2006-01-01, 2006-02-23]
[2006-02-12, 2006-02-13], [2006-03-05, 2006-03-23]
[2006-03-23, 2006-04-23]
[2006-01-13, 2006-06-23]
As we can see, you are getting more intervals - d’oh.
You can imagine how much more complicated all this will get with
multiple packages and of course with more slots (I assume there are
more than hundred concurrent bookings for a package for medium sized
hotels).
As far as I can see, you can approach storing the general data in
four ways.
(I) First, you can store intervals, but since there might be more
than one identical interval, this might end up ugly (storing the same
thing twice is a nono in relational databases). You’d get something
like this:
interval: (from, to, count)
Now, if person A books an interval in the middle of the other
interval, you might end up with the following intervals:
1.) the interval before the booking with count = old_count
2.) the interval after the booking with count = old_count
3.) the interval with the same interval of the boking with count =
old_count-1
Whee, and if person B books the interval in 1., you will have to
merge both intervals back together to keep consistency.
(II) So storing the free slots seems not to be the best way to do it.
What about storing the booked intervals? We might end up with the
following:
package := (id, maximum_bookings)
booking := (id, booking, interval)
So when checking if we have free slots at a given time T, we do
something like (in Pseudo-SQL):
SELECT COUNT(id)
FROM booking
WHERE booking.interval CONTAINS %T%
This seems pretty clean to me. You might want to extend this with a
“slots_booked” column in booking and replace the COUNT(id) with a SUM
(slots_booked).
(III, IV) Since the intervals have a finite count of elements you
could also store the days (or weeks if you can only book full weeks
in your system) and make them have a “availability” count as in (I)
or store only the booked days as in (II).
Since your database will kind of explode (356 days * n slots per
package * m packages), I’d recommend (II). I don’t know how fast such
“date in interval” queries are in your database system. I’d implement
all that stuff and if it turns out that this query is a bottle neck,
you might want to have a “cache” of sorts rebuilt by stored
procedures (or in your code) on storing elements. Of course, this is
kind of a hack and maybe your RDBMS handles those interval questions
really well with an index.
Regards,
Manuel