Oren S. wrote in post #1003858:
Sometimes, though, the conflict resolution actions may be different from the
normal update actions, and then you should still use save_and_if_conflict.
Logically, I don’t see any good reason why the actions should be
different, at least if you are only updating one document.
Consider the two cases:
(A) No conflict
---->time
Client 1     *read  update  *write
Client 2                            *read  update  *write
(B) Conflict
---->time
Client 1     *read  update  *write
Client 2                *read  update  *write
The only difference is that in B, client 2 is reading slightly earlier
(just before client 1 had written). If it happened to read a little bit
later, there would have been no conflict, and it would have done the
update just fine.
Therefore, this is a simple race, and you should code so that you get
the same outcome regardless of the race winner. If you didn’t, then your
application would behave non-deterministically.
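
To make the "same outcome regardless of the race winner" point concrete, here is a minimal sketch. It does not use a real CouchDB client: FakeStore, Conflict, update_with_retry and add_tag are hypothetical names, and the store is an in-memory stand-in that only imitates a revision check on write. The same update action is applied on the normal path and after a conflict; only the document it starts from changes.

```python
import copy


class Conflict(Exception):
    """Raised when a write names a revision that is no longer current."""


class FakeStore:
    """In-memory stand-in for a document store with a revision check on write."""

    def __init__(self):
        self.docs = {}  # doc_id -> (rev_number, body)

    def read(self, doc_id):
        rev, body = self.docs[doc_id]
        return rev, copy.deepcopy(body)

    def write(self, doc_id, based_on_rev, body):
        current_rev, _ = self.docs.get(doc_id, (0, None))
        if based_on_rev != current_rev:
            raise Conflict(doc_id)
        self.docs[doc_id] = (current_rev + 1, copy.deepcopy(body))


def update_with_retry(store, doc_id, update, max_attempts=5):
    """Apply update(body) -> body, re-reading and re-applying after a conflict."""
    for _ in range(max_attempts):
        rev, body = store.read(doc_id)
        try:
            store.write(doc_id, rev, update(body))
            return
        except Conflict:
            continue  # someone else won the race; start again from their version
    raise Conflict(doc_id)


def add_tag(tag):
    """Build an update action that adds one tag and keeps the list sorted."""
    def update(body):
        body["tags"] = sorted(set(body["tags"]) | {tag})
        return body
    return update


store = FakeStore()
store.docs["post"] = (1, {"tags": []})

# Case (B): client 2 reads before client 1 has written.
stale_rev, stale_body = store.read("post")
update_with_retry(store, "post", add_tag("ruby"))          # client 1
try:
    store.write("post", stale_rev, add_tag("couchdb")(stale_body))
except Conflict:
    update_with_retry(store, "post", add_tag("couchdb"))   # same action, fresh read
```

Whichever client wins, the document ends up with both tags, which is the property argued for above.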
However, if you are updating multiple documents together, then it’s a
different case. CouchDB provides no concept of “transaction”; the best
it offers you is a POST to _bulk_docs with “all_or_nothing”:true, which
guarantees that either all of the updates will be written to the
database, or none of them. However, this mode also doesn’t perform any
conflict detection, i.e. you will get conflicting updates just the same
as if they had replicated in.
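
As a rough illustration, the request body for such a POST could be built like this. The document ids, revs and fields are made up, bulk_all_or_nothing_body is a hypothetical helper, and nothing is actually sent; whether your CouchDB version still accepts this option is worth checking in its documentation.

```python
import json


def bulk_all_or_nothing_body(docs):
    """Build the JSON body for a POST to _bulk_docs in all-or-nothing mode."""
    return json.dumps({"all_or_nothing": True, "docs": docs})


# Hypothetical pair of documents that should be written together.
body = bulk_all_or_nothing_body([
    {"_id": "account-a", "_rev": "3-aaa", "balance": 90},
    {"_id": "account-b", "_rev": "7-bbb", "balance": 110},
])
```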
I should say at this point that I think CouchDB is an excellent piece of
software, and its incremental map-reduce and incremental replication are
truly awesome. I just think they made a mistake in attempting to hide,
rather than embrace, conflicting updates. The model described by the
Amazon Dynamo paper takes the opposite view: “write always succeeds”.
Whenever you read a document, you see all the current conflicting
versions, which forces the reader to resolve conflicts; and whenever you
write a document, you list all the parent(s) you are superseding.
So I think CouchDB’s API should have embraced this too.
1. PUT should always succeed (even if it generates a conflict)
2. GET a document should always return an array of all live versions
   of the document, not a single arbitrary version. Ditto for bulk fetch
   and views.
3. PUT a document should specify a list of the revs it is superseding,
   not a single rev.
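
The three rules above can be sketched as a toy in-memory model. This is not CouchDB's behaviour or API: MultiVersionStore, its get/put methods and the integer revs are all invented for illustration.

```python
import itertools


class MultiVersionStore:
    """Toy model of the proposed API: writes never fail, reads show every live version."""

    def __init__(self):
        self._live = {}                    # doc_id -> {rev: body}
        self._counter = itertools.count(1)

    def get(self, doc_id):
        """Return all live versions as a list of (rev, body) pairs."""
        return sorted(self._live.get(doc_id, {}).items())

    def put(self, doc_id, body, supersedes=()):
        """Always succeeds: retire the listed revs and add a new live one."""
        versions = self._live.setdefault(doc_id, {})
        for rev in supersedes:
            versions.pop(rev, None)
        new_rev = next(self._counter)
        versions[new_rev] = body
        return new_rev


store = MultiVersionStore()
r1 = store.put("cart", {"items": ["book"]})

# Two clients both start from r1 and write without seeing each other.
r2 = store.put("cart", {"items": ["book", "pen"]}, supersedes=[r1])
r3 = store.put("cart", {"items": ["book", "ink"]}, supersedes=[r1])

# The next reader sees both live versions and has to resolve them.
conflicting = store.get("cart")
merged = sorted({item for _, body in conflicting for item in body["items"]})
store.put("cart", {"items": merged}, supersedes=[rev for rev, _ in conflicting])
```

Neither write is rejected, no version is silently hidden, and the resolving write names both parents in one step.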
Emulating this behaviour is hard. Well, it’s not too tricky to get (1)
if you use POST to _bulk_docs with “all_or_nothing”:true, and you can
hide this in a client-side API.
However (2) and (3) are a real pain and inefficient; you have to
explicitly issue multiple GETs to get the conflicting revs and the
conflicting versions (especially when processing a view); and when you
resolve a conflict, you have to replace one rev then explicitly delete
the other revs.
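
For a sense of the bookkeeping involved: on the read side you would GET the document with ?conflicts=true to obtain its _conflicts list, then GET each of those revs with ?rev=... separately. On the write side, a sketch of the resolution could look like the following. resolution_docs is a hypothetical helper, the ids and revs are invented, and it only builds the docs list you might send to _bulk_docs, without contacting a server.

```python
def resolution_docs(doc_id, winning_rev, losing_revs, merged_body):
    """Build the docs list that resolves a conflict: update the winning rev
    with the merged content and mark every other rev as deleted."""
    docs = [dict(merged_body, _id=doc_id, _rev=winning_rev)]
    for rev in losing_revs:
        docs.append({"_id": doc_id, "_rev": rev, "_deleted": True})
    return docs


# Hypothetical result of GET /db/cart?conflicts=true ...
current = {"_id": "cart", "_rev": "2-abc", "_conflicts": ["2-def"],
           "items": ["book", "pen"]}
# ... and of the extra GET /db/cart?rev=2-def needed to see the other version.
other = {"_id": "cart", "_rev": "2-def", "items": ["book", "ink"]}

merged = {"items": sorted(set(current["items"]) | set(other["items"]))}
docs = resolution_docs("cart", current["_rev"], current["_conflicts"], merged)
```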
The current HTTP API strongly encourages you to ignore conflicts and
cross your fingers, which of course means applications are built that
way, which means they won’t scale to multi-master environments or
replication with off-line updates. What users will see is that changes
made on one side or the other are simply “lost” - they actually exist in
the database, but the frontend doesn’t let them see them.
BTW there is another CouchDB feature you could look at: the _update
handler. Here you can write some JavaScript to perform an update of an
existing document, rather than the client having to POST the complete
object back again. This gives you something of a “model” layer you can
use, although it’s limited to updating a single document at a time.
http://wiki.apache.org/couchdb/Document_Update_Handlers
Regards,
Brian.