It seems to find no vandalism earlier than February 2004 in any of the
very high visibility articles I checked. Was there a change in
Wikipedia custom and/or rollback functionality at that time, or is
something going wrong with your counter?
In any case, good work.
SCZenz
On 12/24/05, Tony Sidaway <f.crdfa(a)gmail.com> wrote:
My vandalism analysis tool, which uses a simple but powerful
methodology developed by Brian0918, analyses edit summaries on
articles to spot probable vandalism reverts by recognising the summary
patterns of standard rollbacks, and edits labelled "rvv", "rv v" or
"rvc". It was developed for English Wikipedia but probably has
applications beyond that, and the methods developed here have obvious
utility beyond the recognition and reporting of vandalism.
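As a rough illustration of the summary matching described above, here is a minimal sketch in Python. It is not the tool's actual code: the function name and the regular expressions are hypothetical, and the wording assumed for the standard rollback summary ("Reverted edits by ... to last version by ...") is a guess that should be checked against real edit histories.

```python
import re

# Hypothetical patterns. The shorthand labels ("rvv", "rv v", "rvc") come
# from the description above; the rollback wording is an assumption.
REVERT_PATTERNS = [
    re.compile(r"^\s*reverted edits by .+ to last version by .+", re.IGNORECASE),
    re.compile(r"\brv\s?v\b", re.IGNORECASE),  # matches "rvv" and "rv v"
    re.compile(r"\brvc\b", re.IGNORECASE),     # matches "rvc"
]


def looks_like_vandalism_revert(summary: str) -> bool:
    """Return True if the edit summary matches any assumed revert pattern."""
    return any(pattern.search(summary) for pattern in REVERT_PATTERNS)


if __name__ == "__main__":
    for text in ["rvv", "rv v", "rvc", "fixed a typo in the lead"]:
        print(repr(text), looks_like_vandalism_revert(text))
```

The word boundaries keep the short labels from matching inside longer words, so a summary such as "curve fitting section" is not counted.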
You can visit it here:
http://tools.wikimedia.de/~tony_sidaway/
Please try to break it, and tell me what happened. There is a link to
a discussion page for that purpose.
The rationale is that, while vandalism is difficult to recognise
electronically, a pretty easy and reasonably reliable way to track
vandalism on a popular wiki article is to examine edit summaries and
count the proportion of them that indicate that the editors apparently
believed themselves to be reverting vandalism.
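The counting step can be sketched as follows, assuming the edit summaries are already available as a list of strings. The function name, the `is_revert` parameter, and the sample summaries are all hypothetical; the real tool may compute its figures differently.

```python
from typing import Callable, Iterable


def revert_proportion(summaries: Iterable[str],
                      is_revert: Callable[[str], bool]) -> float:
    """Fraction of edit summaries that is_revert flags as vandalism reverts."""
    total = 0
    reverts = 0
    for summary in summaries:
        total += 1
        if is_revert(summary):
            reverts += 1
    # An article with no edits has no meaningful ratio; report 0.0.
    return reverts / total if total else 0.0


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = ["rvv", "added a reference", "copyedit", "expanded history section"]
    # Stand-in predicate; a real one would recognise more patterns.
    print(revert_proportion(sample, lambda s: s.strip().lower() == "rvv"))
```

Passing the predicate in as a parameter keeps the counting separate from whatever pattern set a given language edition needs.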
A highly experimental adaptation of this script to recognise (only)
rollbacks on the German Wikipedia is here:
http://tools.wikimedia.de/~tony_sidaway/cgi-bin/vandalismus.cgi
The text of the latter CGI script is currently in English, although it
is analyzing German text. As I know nothing about
internationalization I have no idea whether it will always perform
correctly if UTF-8 multibyte characters (such as o-umlaut) are
entered.
This simple test seems to suggest that it does work:
http://tools.wikimedia.de/~tony_sidaway/cgi-bin/vandalismus.cgi?article=Köln
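For anyone testing titles with multibyte characters, this short Python sketch shows how such a title is percent-encoded as UTF-8 before it goes into the query string. It only illustrates the client side: the helper name `article_url` is hypothetical, and how the CGI script decodes the parameter is not known here.

```python
from urllib.parse import quote, unquote

BASE = "http://tools.wikimedia.de/~tony_sidaway/cgi-bin/vandalismus.cgi"


def article_url(title: str) -> str:
    """Build a query URL with the title percent-encoded as UTF-8."""
    return f"{BASE}?article={quote(title, safe='')}"


if __name__ == "__main__":
    title = "Köln"
    # o-umlaut takes two bytes in UTF-8, so the title is 4 characters but 5 bytes.
    print(len(title), len(title.encode("utf-8")))
    print(article_url(title))
    # Decoding the encoded form should give back the original title.
    print(unquote(quote(title, safe="")) == title)
```

Browsers usually do this encoding automatically, which may be why pasting the raw title into the URL appears to work.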
Wikipedia is an international project and I welcome any and all
testing input on this.
Presently I don't know of any edit summary patterns that
non-administrators on the German Wikipedia use to indicate that
they're reverting what we on English Wikipedia would recognise as
simple vandalism. As I'm unfamiliar with their practices, I'm not even
certain that they draw the same distinctions that we do on English
Wikipedia between intentional and overt disruptive edits (simple
vandalism) and more subtle vandalism or trolling.
Any help on this that German speakers can offer would be most welcome.
Although I address the German Wikipedia prominently because its
community is highly advanced and well organized, its content
comparable to that of the English Wikipedia, and (not least) Deutsche
Wikipedia hosts the tool server, I would also love to produce useful
tools for as many languages as possible--the skills I learn can be put
to use in tools of more general use than the current one. The scripts
I write can easily be internationalized. I cannot write good German
(whenever I try, native German speakers beg me to stop!) but I can
write good French and reasonable Spanish. I am particularly
interested in Chinese, Indian languages, and Russian.
_______________________________________________
WikiEN-l mailing list
WikiEN-l(a)Wikipedia.org
To unsubscribe from this mailing list, visit:
http://mail.wikipedia.org/mailman/listinfo/wikien-l