Article version validation and import (was Re: [Foundation-l] Most read US newpaper blasts Wikipedia)

Magnus Manske magnus.manske at web.de
Wed Nov 30 22:17:46 UTC 2005


Daniel Mayer wrote:
> --- Magnus Manske <magnus.manske at web.de> wrote:
>   
>> The validation system will, no doubt, suffer from two "flaws" in the
>> regard of offering reliability:
>> 1. Anyone (at least, anyone with a username, if we turn off anons) can
>> "validate"
>>     
>
> Reads outnumber edits at least 200 to 1. Thus there is a HUGE potential
> resource of readers we can draw on to validate articles. I therefore think that
> when/if this feature goes live it should allow anon validation. But validators
> should also be able to rate the ratings of others (ala 'did you find this
> rating useful'). I also assume that comments will be collected. If problems
> arise, we could implement a trust matrix system for validators (anons could be
> nothing more than lowest rated though; their only effect would be in numbers). 
>   
There's currently no "was this rating useful" feature. Brion is already
worried about everyone voting for every revision of every page on a
dozen or so topics bogging down the site; now if everyone can also rate
every rating, we're in exponential hell ;-)

We'll be on more solid ground on this once we have test-run the
validation system (with anons) for some time. "Trust matrix" sounds
eerily like "Karma points", though :-(
>   
>> 2. Validations will have to be interpreted to simplify them to a
>> good/suspicious/bad rating
>>     
>
> A simple star system for a few different areas:
> 1) Completeness
> 2) Accuracy  
> 3) Readability
>   
That's what the user should see in the end; but:
http://meta.wikimedia.org/wiki/En_validation_topics
Try to automatically boil *that* down to a few stars. Problems I can see
at a glance:
* Are older revisions of an article considered when calculating the
"stars" of the current one? If yes, it will falsify the result for
vandalized pages. If no, every minor edit will reset the validation
counter to zero for the current version, until a few people vote for
that revision again.
* For a range of 1-5 points, are a "1" and a "5" vote the same as two
"3" votes? If yes, it hides an apparent controversy; if no, it opens the
door for vandal votes getting unusually high weight.
And so on.
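
To make the two problems concrete, here is a minimal Python sketch. It is
not the validation code itself; the function names and the sample votes
are made up for illustration, and I assume votes are plain integers from
1 to 5 stored per revision.

```python
from statistics import mean, pstdev

def summarize(votes):
    """Return (mean, spread) for a non-empty list of 1-5 votes."""
    return mean(votes), pstdev(votes)

def votes_for_current(votes_by_revision, include_older):
    """Pick the votes that count towards the current revision.

    votes_by_revision is a list of vote lists, oldest revision first
    (a hypothetical layout, not the real database schema).
    """
    if include_older:
        return [v for rev in votes_by_revision for v in rev]
    return list(votes_by_revision[-1])

# Same average, very different agreement:
split = summarize([1, 5])    # (3, 2.0): controversial
agreed = summarize([3, 3])   # (3, 0.0): consensus

# Good revision, vandalized revision, then a fresh minor edit:
history = [[5, 5, 4], [1, 1], []]
with_older = votes_for_current(history, include_older=True)      # vandal-era votes leak in
current_only = votes_for_current(history, include_older=False)   # counter is back to zero
```

Showing the spread next to the average would at least expose the
controversy, but it does not settle how much weight an outlier vote
should get.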
>   
>> There is a radical alternative, which I have begun to code a few weeks
>> ago. It alters a MediaWiki installation to "import-only", replacing
>> editing with an import function for an article version from wikipedia.
>> As the imported articles are not editable at all, they do not represent
>> a fork, merely a static wikipedia snapshot, alas per article and not for
>> the whole wikipedia. Such a system would allow imports only for
>> logged-in users, and be invite-only.
>>     
>
> Logged-in users should only be able to import the highest-rated version of
> articles that have at least x number of validations. That would negate the need
> to create a new user class. But if/when that is abused, then we may need to use
> a trust matrix system for users and only allow trusted users to import
> validated article versions. A hack would be to add a new user class and an
> admin-like community approval process. But I don't think that will scale fast
> enough.
>   
That would make it dependent on the validation system (which I still
want to see, don't get me wrong :-)
The whole point of my proposal is to separate the new "user class" from
wikipedia entirely; it will be a separate project, but based on
wikipedia. The cathedral filtering the bazaar. The best of both worlds,
hopefully. If this turns out to be a false hope, it can be trashed
easily enough without side effects for wikipedia.
>   
>> Individuals could then choose which "issue" to read, and mirrors could
>> decide if they want to go for "slow quality" or "fast unreliability"...
>>     
>
> Heck - why not just automatically add a prominent link at the top of each page
> that says 'Read the highest-rated version of this article' and mark those
> versions in the database so mirrors can choose to just display those versions?
> Then there would be no need for manual import. But it still may be a good idea
> to have trusted humans doing final reviews of reader-validated content. 
>
> Either way works for me so long as the most up-to-date version of articles is
> displayed by default (as is now the case). Logged-in users should be able to
> change their preferences so they only see the highest rated version of articles
> if available. 
>   
We should definitely try something along those lines inside wikipedia.
All I'm saying is that the possible gain from a separate project is
high, the cost low, and a failure will not affect Wikipedia in any way.
>   
>> Yes, a few people (compared to Wikipedia editors) will take a long time
>> to check/fix/import all Wikipedia articles. Also, the imported versions
>> will soon be outdated compared to Wikipedia. So what? This site will be
>> for reliability; Wikipedia is for development and current events coverage.
>>     
>
> A validation feature could feed an import queue: Article versions that reach a
> certain rating threshold could go into an RC-like list. Then a group of
> logged-in users check the queue and give the final go ahead for that article
> version to be marked as the 'Highest rated version' for that article.
>   
That could be a nice tool. For "importers", there could also be a display
on each imported article of how many versions have "passed" since the last
import, and maybe the current validation rating, compared to that of
the imported version.
Which leaves the question of how to determine a high-rating article.
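
A rough Python sketch of such a queue and of the "versions passed"
counter, under stated assumptions: the Revision record, the threshold
values and both function names are hypothetical, the rating is taken to
be a single pre-computed number from 1 to 5, and revision ids are
assumed to increase over time. How that rating is computed is exactly
the open question.

```python
from dataclasses import dataclass

@dataclass
class Revision:
    article: str
    rev_id: int
    rating: float      # hypothetical aggregate rating, 1-5
    validations: int   # number of votes behind that rating

def import_queue(revisions, min_rating=4.0, min_validations=10):
    """Return revisions that pass both thresholds, best rated first."""
    candidates = [r for r in revisions
                  if r.rating >= min_rating and r.validations >= min_validations]
    return sorted(candidates, key=lambda r: r.rating, reverse=True)

def versions_since_import(revisions, article, imported_rev_id):
    """Count revisions of one article that are newer than the imported one."""
    return sum(1 for r in revisions
               if r.article == article and r.rev_id > imported_rev_id)

revs = [
    Revision("Foo", 10, 4.5, 12),
    Revision("Foo", 11, 4.8, 3),    # well rated, but too few votes
    Revision("Bar", 7, 3.0, 40),    # many votes, but rated too low
    Revision("Foo", 12, 4.1, 15),
]
queue = import_queue(revs)                        # revisions 10 and 12
passed = versions_since_import(revs, "Foo", 10)   # 2
```

The importers would still give the final go-ahead by hand; the queue
only narrows down what they have to look at.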
>   
>> I would see such a site working in parallel to the validation feature.
>> Some might argue that this would "split out forces", with some people
>> validating and some importing. OTOH, a little friendly competition might
>> do good for motivation.
>>     
>
> Readers validate and editors import. I don't see how that is splitting out
> forces when readers do so little as is.
>   
There's probably an overlap of wikipedians (in contrast to "mere"
readers) who might be very active in either function; that's the source
we'd be splitting.

The average reader who occasionally rates an article as good or bad
won't do any importing, that's right.

Magnus


