Re: [Wikipedia-l] Study on Interfaces to Improving Wikipedia Quality

25 Nov 2008

I agree with Gregory that it is very useful to quantify the usefulness of
trust information on text -- otherwise, all comparisons are very subjective.
In our WikiSym 08 paper, we measure various parameters of the "trust"
coloring we compute, including:

   - Recall of deletions.  Only 3.4% of text is in the lower half of trust
   values, yet this is 66% of the text that is deleted in the very next
   revision.
   - Precision of deletions.  Text in the bottom half of trust values has
   probability 33% of being deleted in the next revision, against a probability
   of 1.9% for general text.  The deletion probability rises to 62% for text
   in the bottom 20% of trust values.
   - We study the correlation between the trust of a word, sampled at random
   in all revisions, and the future lifespan of a word (correcting for the
   finite horizon effect due to the finite number of revisions in each
   article), showing positive correlation.
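To make the first two measures concrete, here is a minimal sketch of how
recall and precision of deletions could be computed from per-word data. It is
an illustration only, not the code used for the paper: the function name, the
(trust, deleted) pair format, and the 0-10 trust scale with a cutoff at 5 for
the "lower half" are all assumptions.

```python
from typing import Iterable, Tuple


def deletion_precision_recall(
    words: Iterable[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Return (precision, recall) of low trust as a predictor of deletion.

    Each item is a hypothetical (trust, deleted_in_next_revision) pair for
    one word. A word counts as "low trust" when trust < threshold.
    """
    low = deleted = low_and_deleted = 0
    for trust, was_deleted in words:
        is_low = trust < threshold
        low += is_low
        deleted += was_deleted
        low_and_deleted += is_low and was_deleted
    # Precision: fraction of low-trust words that are deleted next.
    precision = low_and_deleted / low if low else 0.0
    # Recall: fraction of deleted words that were low trust.
    recall = low_and_deleted / deleted if deleted else 0.0
    return precision, recall


# Made-up sample data, assuming trust values on a 0-10 scale.
sample = [
    (0.5, False), (1.0, True), (2.0, True), (3.0, False), (4.0, True),
    (7.0, False), (8.0, False), (9.0, False), (9.5, True),
]
precision, recall = deletion_precision_recall(sample, threshold=5.0)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Both figures use the same count of words that are low trust and deleted; only
the denominator differs (all low-trust words for precision, all deleted words
for recall).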

Some aspects are not captured by the above measures:

   - We ensured that every "tampering" (including cut-and-paste) is
   reflected in the trust coloring, so it is hard to subvert the algorithm
   (does "age" provide this?).
   - We ensured the whole scheme is robust wrt attacks (see the various
   papers if you are interested).

I fully believe that it should not be hard to improve on our system re. the
above measurements.  And I fully agree that the "reputation" we compute is
essentially an internal parameter of the system, and does not really
constitute a good summary of a person's overall Wikipedia contribution; for
this and other reasons we do not display it.

Luca

A simple objective challenge for any predictive coloring system would
...
  be to use them in the following experimental
procedure:

 * Take a dump of Wikipedia up to a year old, and use this as the underlying
 knowledge for the systems.
 * Make several random selections of articles and include the newer
 revisions not included in the initial set up to 6 months old. Call
 these the test sets.
 * The predictive coloring system should then take each revision in a
 test set in time order and predict if it will be reverted (Within X
 time?).
 * The actual edits up to now should be analyzed to determine which
 changes actually were reverted and when.

 The final score will be the false positive and false negative rates.
 So long as we assume that the existing editing practices are not too
 bad we should find that the best predictive coloring system would
 generally tend to minimize these rates.
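
 The scoring step of the procedure above could be sketched roughly as
 follows. This is only an illustration: the function name and the
 representation of predictions and outcomes as parallel lists of booleans
 (one entry per test-set revision, True meaning "reverted") are assumptions,
 and the "within X time" window is left to whoever builds the lists.

```python
from typing import Sequence, Tuple


def error_rates(
    predicted: Sequence[bool], actual: Sequence[bool]
) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    predicted[i] is True if the system predicted revision i would be
    reverted; actual[i] is True if it really was reverted.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    false_pos = sum(p and not a for p, a in zip(predicted, actual))
    false_neg = sum(a and not p for p, a in zip(predicted, actual))
    negatives = sum(not a for a in actual)   # revisions never reverted
    positives = len(actual) - negatives      # revisions actually reverted
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr


# Made-up example: five test revisions in time order.
predicted = [True, False, True, False, False]
actual = [True, True, False, False, False]
fpr, fnr = error_rates(predicted, actual)
print(f"false positive rate={fpr:.2f}, false negative rate={fnr:.2f}")
```

 Reporting the two rates separately, rather than a single accuracy figure,
 matters here because reverted revisions are likely a small minority.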
 _______________________________________________
 Wikipedia-l mailing list
 Wikipedia-l(a)lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/wikipedia-l
