On Mon, Aug 28, 2006 at 10:52:42PM -0400, Simetrical wrote:
On 8/28/06, Jay R. Ashworth <jra(a)baylink.com> wrote:
Because in wikitext, everything is in-band; in XML, the structure is
out-of-band, on purpose. This requires an entirely different and, I
suspect, much more complicated diff algorithm.
I don't know what "in-band" and "out-of-band" mean ([[Out of band]]
doesn't help either),
The current diff engine, with which I'm not intimately familiar (read
that as: I haven't looked at the code at all, but I'm assuming it's
somewhat similar to the Unix diff internals), is working on one big
stream of text. The structural markup is *part* of that stream of
text, hence, in-band.
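
To make "in-band" concrete, here is a minimal sketch in Python. It
assumes a plain line-based diff (the standard difflib module), which is
a stand-in and not the actual MediaWiki diff engine; the sample wikitext
and the name text_diff are made up for illustration. The heading change
(markup) and the date change (content) come out as the same kind of
thing: changed lines of text.

```python
import difflib


def text_diff(old: str, new: str) -> list[str]:
    """Line-based diff over the raw text stream.

    Markup such as '== ... ==' gets no special treatment: it is just
    more characters in the stream, which is what "in-band" means here.
    """
    return [
        line
        for line in difflib.ndiff(old.splitlines(), new.splitlines())
        if line.startswith(("+ ", "- "))  # drop unchanged and '? ' hint lines
    ]


# Hypothetical sample: one structural edit and one content edit.
old = "== History ==\nThe town was founded in 1850."
new = "=== History ===\nThe town was founded in 1851."

changes = text_diff(old, new)
for line in changes:
    print(line)
```

Nothing in the output says which change touched structure and which
touched content; a reader (or a smarter tool) has to work that out.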
but if the diff engine parses the XML, it can
look for a) changes in structure/markup and b) changes in content.
Yep, and those will interact in ways different from the ways that they
do now: the current diff engine need not "trip over" the edges of
objects in the way that an XML parser will have to.
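
For comparison, a rough sketch of the parsed approach, again in Python
and again only an illustration: it uses xml.etree.ElementTree from the
standard library, the sample documents and the name tree_diff are
hypothetical, and it pairs children naively by position. That naive
pairing is exactly where the "edges of objects" bite: once an element is
inserted, removed, or renamed, a real tree differ needs some alignment
step that this sketch does not attempt. Tail text is ignored as well.

```python
import xml.etree.ElementTree as ET


def tree_diff(old, new, path=""):
    """Walk two element trees in parallel.

    Returns a list of (kind, path, detail) tuples, where kind is
    'structure' or 'content'. Children are paired by position only.
    """
    here = f"{path}/{old.tag}"
    if old.tag != new.tag:
        # A different element at this position: report it as structural
        # and stop, since the subtrees are no longer comparable.
        return [("structure", here, f"{old.tag} -> {new.tag}")]

    changes = []
    if old.attrib != new.attrib:
        changes.append(("structure", here, "attributes changed"))
    if (old.text or "").strip() != (new.text or "").strip():
        changes.append(("content", here, "text changed"))

    old_kids, new_kids = list(old), list(new)
    for a, b in zip(old_kids, new_kids):
        changes.extend(tree_diff(a, b, here))
    if len(old_kids) != len(new_kids):
        changes.append(("structure", here, "child count changed"))
    return changes


# Hypothetical sample: the same two edits as a heading-level change
# (structure) and a date change (content).
old = ET.fromstring("<article><h2>History</h2><p>Founded in 1850.</p></article>")
new = ET.fromstring("<article><h3>History</h3><p>Founded in 1851.</p></article>")

result = tree_diff(old, new)
for change in result:
    print(change)
```

Here the two kinds of change are reported separately, but note that the
heading's text is never compared once its tag differs.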
Either one should be very easy and fast to diff, given XML-parsing
library functions (for the C++ module used on WMF sites, that is).
Faster than present, I don't know, but the present differ is hardly a
bottleneck.
Certainly. I wasn't suggesting that it was; rather, the opposite.
Anyone got any implementation experience with diffing XML trees?
Cheers,
-- jra
--
Jay R. Ashworth jra(a)baylink.com
Designer Baylink RFC 2100
Ashworth & Associates The Things I Think '87 e24
St Petersburg FL USA
http://baylink.pitas.com +1 727 647 1274
The Internet: We paved paradise, and put up a snarking lot.