One thing that stood out for me in the small sample of articles I
examined was that innocuous changes by casual users, such as spelling
and grammar corrections, were flagged. A "nice-to-have" would
therefore be a "smoothing" algorithm that ignores inconsequential
changes: spelling corrections, or the reordering of semantically
self-contained units of text (for example, reordering the items in a
list without changing the content of any item, or reordering
paragraphs and perhaps even sentences). I think this would cover 90%
or more of the changes that are immaterial to an article's
credibility.
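To make the suggestion concrete, here is a rough sketch of what such a
smoothing check might look like. It is only an illustration, not code
from the extension: the function names, the choice of one line as the
"unit" of text, and the 0.9 similarity threshold are all assumptions
of mine. It treats an edit as immaterial if the units are merely
reordered, or if every changed unit has a near-identical counterpart
(as a spelling fix would).

```python
import difflib
from collections import Counter


def split_units(text):
    """Split text into units: one per non-empty line, whitespace normalised.

    Using a line as the unit is an assumption; paragraphs or sentences
    could be used instead.
    """
    return [" ".join(line.split()) for line in text.splitlines() if line.strip()]


def similarity(a, b):
    """Character-level similarity in [0, 1] from difflib."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def is_immaterial_change(old_text, new_text, threshold=0.9):
    """Heuristic: True if the edit only reorders units or makes tiny fixes.

    `threshold` is a hypothetical tuning parameter, not a tested value.
    """
    old_units = Counter(split_units(old_text))
    new_units = Counter(split_units(new_text))

    # Units present on one side only; reordering leaves both lists empty.
    removed = list((old_units - new_units).elements())
    added = list((new_units - old_units).elements())

    if not removed and not added:
        return True  # pure reordering or whitespace-only change
    if len(removed) != len(added):
        return False  # content was added or deleted outright

    # Greedily pair each removed unit with its closest added unit.
    for old_unit in removed:
        best = max(added, key=lambda unit: similarity(old_unit, unit))
        if similarity(old_unit, best) < threshold:
            return False
        added.remove(best)
    return True
```

For example, is_immaterial_change("apples\noranges", "oranges\napples")
is treated as immaterial, while adding a new sentence is not. One
obvious weakness: a character-level measure cannot tell a spelling fix
from a small but material change such as "1999" to "1998", so a real
implementation would need something smarter for numbers and names.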
A happy new year to everybody. Thanks to Aaron and Tim, the code is
now fit for large-scale applications. The next step (large-scale
beta test or immediate deployment) is now under discussion; I'll keep
you in the loop.
The current version of the extension can be found on
test.wikipedia.org. It uses a set of parameters that shows most
features but will not be used in this form. You can give yourself
rights by logging in and going to
http://test.wikipedia.org/wiki/Special:Userrights.
If you find any bugs or have points you want to discuss, right now
would be a good time :-)
Best,
Philipp