Lorax 0.2.0 Released

lorax version 0.2.0 has been released!

The Lorax is a full diff and patch library for XML/HTML documents, based
on Nokogiri.

It can tell you whether two XML/HTML documents are identical, or if
they’re not, tell you what’s different. In trivial cases, it can even
apply the patch.

It’s based loosely on Gregory Cobena’s PhD thesis, which describes an
algorithm that generates deltas in less than O(n * log n) time, accepting
some tradeoffs in the size of the delta set. You can find his paper at
Gregory Cobena’s PhD Thesis.

“I am the Lorax, I speak for the trees.”

Changes:

0.2.0 (2010-10-14)

  • Better handling of whitespace: blank text nodes are ignored, as is
    leading and trailing whitespace in text nodes. GH#2.
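
To make the whitespace rule concrete, here is a minimal sketch in plain Ruby. It is not Lorax's code: `significant_text` is a hypothetical helper, and text nodes are modeled as bare strings rather than Nokogiri nodes.

```ruby
# Hypothetical helper, not part of Lorax: normalizes a list of text-node
# strings the way the changelog entry above describes.
def significant_text(texts)
  texts
    .map(&:strip)       # ignore leading and trailing whitespace
    .reject(&:empty?)   # ignore text nodes that were blank
end

indented = ["\n  ", "  Hello ", "\n", "world\n"]
compact  = ["Hello", "world"]

# Both lists reduce to the same significant text.
puts significant_text(indented) == significant_text(compact)
```

Under this rule, re-indenting a document should not register as a difference by itself.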

== Features / Problems

  • Detect differences between documents, or tell whether two documents
    are the same.
  • Generate patches for the differences between documents.
  • Apply patches for trivial cases.
  • More work needs to be done to make sure patches apply cleanly.

== Synopsis

Imagine you have two Nokogiri::XML::Documents. You can tell if they’re
identical:

Lorax::Signature.new(doc1.root).signature ==
  Lorax::Signature.new(doc2.root).signature

You can generate a delta set (currently opaque (sorry kids)):

delta_set = Lorax.diff(doc1, doc2)

and apply the delta set as a patch to the original document:

new_doc   = delta_set.apply(doc1)
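
One way to picture the signature comparison from the synopsis is a hash computed bottom-up over the tree, so that equal subtrees yield equal signatures. The sketch below is an illustration only, not Lorax's actual implementation: `Node` and `signature` are hypothetical stand-ins, and the use of SHA1 is an assumption.

```ruby
require "digest/sha1"

# Hypothetical stand-in for an XML node; Lorax itself works on Nokogiri nodes.
Node = Struct.new(:name, :text, :children)

# Hash a node's name, its trimmed text, and its children's signatures,
# so that two equal subtrees always produce the same digest.
def signature(node)
  child_sigs = node.children.map { |child| signature(child) }
  Digest::SHA1.hexdigest([node.name, node.text.to_s.strip, *child_sigs].join("\0"))
end

doc1 = Node.new("root", nil, [Node.new("a", "hello", []), Node.new("b", "world", [])])
doc2 = Node.new("root", nil, [Node.new("a", " hello\n", []), Node.new("b", "world", [])])
doc3 = Node.new("root", nil, [Node.new("a", "hello", []), Node.new("b", "there", [])])

puts signature(doc1) == signature(doc2)  # true: only whitespace differs
puts signature(doc1) == signature(doc3)  # false: a text node changed
```

Comparing two root signatures answers "are these documents identical?" without walking both trees side by side; working out what changed is the job of the delta set.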