Many projects have installed a “popular pages” tool highlighting which of the pages carrying the project's talk page banner are most popular. It is updated roughly monthly; see for example https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Somerset/Popular_pages. The toolserver tool at https://toolserver.org/~alexz/pop/ may also be useful.

 

On a more general point, can I just ask why an automated tool (using all the suggested parameters) is likely to be any more accurate than the human-generated WikiProject rankings?

 

Rod

 

From: wikimediauk-l-bounces@lists.wikimedia.org [mailto:wikimediauk-l-bounces@lists.wikimedia.org] On Behalf Of Edward Saperia
Sent: 17 April 2014 14:48
To: UK Wikimedia mailing list
Subject: Re: [Wikimediauk-l] Rating Wikimedia content (was Our next, strategy plan-Paid editing)

 

It's interesting that in most circumstances good online content is considered to drive traffic, i.e. quality pages attract more views, but with Wikipedia articles I've only ever seen people assume the reverse: high-traffic articles => more editors => higher quality. This is intuitive, but it would be interesting to see how true it is. It would also be interesting to see what percentage of readers are editors by topic area; I suspect this would vary a lot.

 

I always find it a bit of a shame that viewership figures are hidden away in an unpublicised tool (https://tools.wmflabs.org/wikiviewstats/). I would have thought that seeing how many people view a page would be very motivating for editors, and it could perhaps be displayed prominently, e.g. on talk pages.
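
As a rough illustration of how such figures could be surfaced programmatically, here is a minimal sketch that pulls a month of view counts from the Wikimedia pageviews REST API; the dates, helper name and User-Agent string are illustrative assumptions, and this is not the tool linked above.

    import requests

    def monthly_views(article, project="en.wikipedia",
                      start="20240301", end="20240331"):
        # Sum the daily per-article view counts over the given date range.
        url = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
               "%s/all-access/user/%s/daily/%s/%s" % (project, article, start, end))
        resp = requests.get(url, headers={"User-Agent": "quality-metrics-sketch/0.1"})
        resp.raise_for_status()
        return sum(item["views"] for item in resp.json()["items"])

    # e.g. print(monthly_views("Somerset"))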


Edward Saperia

Chief Coordinator Wikimania London

email • facebook • twitter • 07796955572

133-135 Bethnal Green Road, E2 7DG

 

On 17 April 2014 14:34, Simon Knight <sjgknight@gmail.com> wrote:

I think we’d want to distinguish between:

· Quality – taken from diff-features (i.e. Writing, but possibly including Sources), and

· Significance – taken from Traffic, Edit History, and Discussion

 

The latter might be used as a weighting, so that high quality edits on highly significant articles are rated higher, but that’s a secondary question.

 

You’re right that there’s then an interesting issue re: what’s output, and how it is used. In this case our primary interest is in getting a feel for the level of quality of the body of WMUK-related edits. One can easily imagine that being used by other chapters/orgs within the movement, but it could of course also be applied beyond article writing (e.g. to any education assignment on a wiki) and/or be used by WikiProjects to explore article quality. I would guess that outputting discrete scores for various things (e.g. ‘referencing’, ‘organisation’, etc.) and providing some means to amalgamate those scores into an overview would be more useful than just a raw ‘score’/rating. I hope to think about this a bit more over the weekend.
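
To make the “discrete scores plus a means to amalgamate them” idea concrete, here is a minimal sketch, assuming sub-scores and significance are expressed on a 0–1 scale; the names, weights and numbers are illustrative assumptions only.

    def overall_quality(sub_scores, weights=None):
        # Weighted mean of discrete sub-scores (e.g. 'referencing', 'organisation').
        weights = weights or {k: 1.0 for k in sub_scores}
        total = sum(weights[k] for k in sub_scores)
        return sum(sub_scores[k] * weights[k] for k in sub_scores) / total

    def significance_weighted(sub_scores, significance, weights=None):
        # Scale the amalgamated quality score by a 0-1 significance score,
        # so high quality edits on highly significant articles rate higher.
        return overall_quality(sub_scores, weights) * significance

    scores = {"referencing": 0.6, "organisation": 0.8, "writing": 0.7}
    print(overall_quality(scores))                         # amalgamated overview
    print(significance_weighted(scores, significance=0.9)) # weighted by significance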

 

Cheers

S

 

From: wikimediauk-l-bounces@lists.wikimedia.org [mailto:wikimediauk-l-bounces@lists.wikimedia.org] On Behalf Of John Byrne
Sent: 17 April 2014 13:46
To: wikimediauk-l@lists.wikimedia.org
Subject: Re: [Wikimediauk-l] Rating Wikimedia content (was Our next, strategy plan-Paid editing)

 

I must say I'm pretty dubious about this approach for articles. I doubt it can detect most of the typical problems with them - for example, all-online sources are very often a warning sign, but may not be, or may be inevitable for a topical subject. Most of Charles' factors below relate more to views and controversy than to article quality, and article quality has only a limited ability to increase views, as a study of FAs before and after expansion will show.

Personally I'd think the lack of one to, say, four dominant editors in the stats is a more reliable warning sign than "A single editor, or essentially only one editor with tweaking" in most subjects. Nothing is more characteristic of a popular but poor article than a huge list of contributors, all with fewer than 10 edits. Sadly, the implied notion (at T for Traffic below) that fairly high views automatically lead to increased quality is very dubious - we have plenty of extremely bad articles that have by now had millions of viewers, who have between them done next to nothing to improve the now-ancient text.

There is also the question of what use the results of the exercise will be. Our current quality ratings certainly have problems, but they are a lot better than nothing. However, the areas where systematic work is going on to improve the lowest-rated articles, in combination with high importance ratings, are relatively few. An automated system is hard to argue with, & I'm concerned that such ratings will actually cause more problems than they reveal or solve if people take them more seriously than they deserve, or are unable to override or question them.

One issue with the manual system is that it tends to give greatly excessive weight to article length, as though there were a standard ideal size for all subjects, which of course there isn't. It will be even harder for an automated system to avoid the same pitfall without relying on the very blunt instrument of our importance ratings, which don't pretend to operate to common standards, so that nobody thinks that "high-importance" means, or should mean, the same between, say, WikiProject_Friesland and Wikiproject:Science.

John

Date: Wed, 16 Apr 2014 19:53:20 +0100
From: Charles Matthews <charles.r.matthews@ntlworld.com>
 
 
There's the old DREWS acronym from How Wikipedia Works, to which I'd now
add T for traffic. In other words there are six factors that an experienced
human would use to analyse quality, looking in particular for warning signs.
 
D = Discussion: crunch the talk page (20 archives = controversial, while no
comments indicates possible neglect)
R = WikiProject rating, FWIW, if there is one.
E = Edit history. A single editor, or essentially only one editor with
tweaking, is a warning sign. (Though not if it is me, obviously)
W = Writing. This would take some sort of text analysis. Work to do here.
Includes detection of non-standard format, which would suggest neglect by
experienced editors.
S = Sources. Count footnotes and so on.
T = Traffic. Pages at 100 hits per month are not getting many eyeballs.
Warning sign. Very high traffic is another issue.
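 
A minimal sketch of how these warning signs might be automated, assuming the raw inputs (talk-archive count, per-editor edit counts, wikitext, monthly views) have already been fetched; the thresholds simply echo the figures above, and everything else is an illustrative assumption.
 
    def drews_t_warnings(talk_archives, edits_per_editor, wikitext, monthly_views):
        # Return a list of warning signs for one article.
        warnings = []

        # D = Discussion: many archives suggests controversy, none suggests neglect.
        if talk_archives >= 20:
            warnings.append("controversial (20+ talk archives)")
        elif talk_archives == 0:
            warnings.append("possible neglect (no talk page comments)")

        # E = Edit history: essentially a single dominant editor.
        total = sum(edits_per_editor.values())
        if total and max(edits_per_editor.values()) / total > 0.9:
            warnings.append("single dominant editor")

        # S = Sources: footnote count as a rough proxy.
        if wikitext.count("<ref") < 3:
            warnings.append("few references")

        # T = Traffic: around 100 hits per month is not many eyeballs.
        if monthly_views <= 100:
            warnings.append("low traffic")

        # R (project rating) and W (writing analysis) need extra inputs
        # and are left out of this sketch.
        return warnings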
 
Seems to me that there is enough to bite on, here.
 
Charles
 
 
 

 



 


_______________________________________________
Wikimedia UK mailing list
wikimediauk-l@wikimedia.org
http://mail.wikimedia.org/mailman/listinfo/wikimediauk-l
WMUK: https://wikimedia.org.uk