Many projects have installed a “popular pages” tool highlighting which of
the pages with the talk page banner are most popular. It is updated
monthly(ish); see for example
https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Somerset/Popular_pages
so the toolserver tool at https://toolserver.org/~alexz/pop/ may also be
useful.
On a more general point, can I just ask why an automated tool (using all
the suggested parameters) is likely to be any more accurate than the
human-generated WikiProject rankings?
I think the idea is that, while it may be less accurate, it can be run
reliably on a much larger number of articles, and more frequently, so you
can do things like, say, track the daily change in quality for the top
1000 most-viewed articles.
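For illustration, a minimal sketch of the "top 1000 viewed" part against
the public Wikimedia Pageviews REST API (which postdates the
toolserver-era tools above); the score_quality stub is hypothetical,
standing in for whatever automated rating is eventually chosen:

import requests

# Top 1000 most-viewed articles for one day, per the Pageviews REST API
API = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/top/"
       "en.wikipedia/all-access/{y}/{m:02d}/{d:02d}")

def top_articles(y, m, d):
    """Return [(title, views), ...] for the top 1000 articles on one day."""
    r = requests.get(API.format(y=y, m=m, d=d),
                     headers={"User-Agent": "quality-tracker-sketch/0.1"})
    r.raise_for_status()
    return [(a["article"], a["views"])
            for a in r.json()["items"][0]["articles"]]

def score_quality(title):
    """Placeholder for the automated quality model under discussion."""
    raise NotImplementedError

for title, views in top_articles(2024, 4, 17)[:5]:
    print(title, views)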
Ed
*From:* wikimediauk-l-bounces(a)lists.wikimedia.org *On Behalf Of* Edward
Saperia
*Sent:* 17 April 2014 14:48
*To:* UK Wikimedia mailing list
*Subject:* Re: [Wikimediauk-l] Rating Wikimedia content (was Our next
strategy plan - Paid editing)
It's interesting to think that in most circumstances, good online content
is considered to drive traffic, i.e. quality pages attract more views, but
with Wikipedia articles, I've only ever seen people think high traffic
articles => more editors => higher quality. This is intuitive, but it would
be interesting to see how true it is. It would also be interesting to see
what percentage of readers are editors by topic area; I suspect this would
vary a lot.
I always find it a bit of a shame that viewership figures are hidden away
in an unpublicised tool (https://tools.wmflabs.org/wikiviewstats/). I
would have thought seeing how many people view a page would be very
motivating to editors, and perhaps could be displayed prominently, e.g. on
talk pages.
*Edward Saperia*
Chief Coordinator Wikimania London <http://www.wikimanialondon.org>
email <ed(a)wikimanialondon.org> • facebook <http://www.facebook.com/edsaperia>
• twitter <http://www.twitter.com/edsaperia> • 07796955572
133-135 Bethnal Green Road, E2 7DG
On 17 April 2014 14:34, Simon Knight <sjgknight(a)gmail.com> wrote:
I think we’d want to distinguish between:
- Quality – taken from diff-features (i.e. Writing, but possibly
  including Sources), and
- Significance – taken from Traffic, Edit History, and Discussion
The latter might be used to weight the ratings, so that high-quality edits
on highly significant articles are rated higher, but that’s a secondary
question.
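A minimal sketch of that weighting, assuming both scores are normalised
to [0, 1] (the formula and the alpha knob are illustrative, not a settled
design):

def weighted_rating(quality, significance, alpha=0.5):
    """Scale a quality score so that high-quality edits on highly
    significant articles come out rated higher; alpha controls how
    much significance matters (alpha=0 ignores it)."""
    return quality * (1 + alpha * significance)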
You’re right that there’s then an interesting issue re: what’s output and
how it is used. In this case our primary interest is in getting a feel for
the *quality* of the *quantity* of WMUK-related edits. One can easily
imagine that being used by other chapters/orgs within the movement, but it
could of course also be extended beyond article writing (e.g. to any
education assignment on a wiki) and/or be used by projects to explore
article quality. I would guess that outputting discrete scores for various
things (e.g. ‘referencing’, ‘organisation’, etc.) and providing some means
to amalgamate those scores into an overview would be more useful than just
a raw ‘score’/rating. I’ll think about this a bit more over the weekend, I
hope.
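As a rough sketch of what that output might look like (the dimension
names and the equal default weights are assumptions):

def amalgamate(subscores, weights=None):
    """Combine discrete per-dimension scores into a single overview,
    keeping the individual scores available for display."""
    weights = weights or {k: 1.0 for k in subscores}
    total = sum(weights.values())
    overview = sum(s * weights[k] for k, s in subscores.items()) / total
    return {"overview": round(overview, 2), **subscores}

print(amalgamate({"referencing": 0.6, "organisation": 0.8, "prose": 0.7}))
# {'overview': 0.7, 'referencing': 0.6, 'organisation': 0.8, 'prose': 0.7}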
Cheers
S
*From:* wikimediauk-l-bounces(a)lists.wikimedia.org *On Behalf Of* John
Byrne
*Sent:* 17 April 2014 13:46
*To:* wikimediauk-l(a)lists.wikimedia.org
*Subject:* Re: [Wikimediauk-l] Rating Wikimedia content (was Our next
strategy plan - Paid editing)
I must say I'm pretty dubious about this approach for articles. I doubt it
can detect most of their typical problems - for example, all-online
sources are very often a warning sign, but may not be, or may be
inevitable for a topical subject. Most of Charles' factors below relate
better to views and controversy than to article quality, and article
quality has only a limited ability to increase views, as a study of FAs
before and after expansion will show.
Personally I'd think the lack of one to, say, four dominant editors in the
stats is a more reliable warning sign in most subjects than "a single
editor, or essentially only one editor with tweaking". Nothing is more
characteristic of a popular but poor article than a huge list of
contributors, all with fewer than 10 edits. Sadly, the implied notion (at
T for Traffic below) that fairly high views automatically lead to
increased quality is very dubious - we have plenty of extremely bad
articles that have by now had millions of viewers who have between them
done next to nothing to improve the now-ancient text.
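That concentration heuristic is easy to compute from an edit history; a
sketch, assuming a list of revisions each carrying a "user" field (the
0.25 cut-off is an arbitrary assumption):

from collections import Counter

def top_editor_share(revisions, k=4):
    """Fraction of all edits made by the top k contributors. A low
    share on a popular article - a huge list of contributors with a
    handful of edits each - is the warning sign described above."""
    counts = Counter(rev["user"] for rev in revisions)
    top_k = sum(n for _, n in counts.most_common(k))
    return top_k / sum(counts.values())

# e.g. flag an article for review when top_editor_share(revisions) < 0.25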
There is also the question of what use the results of the exercise will
be. Our current quality ratings certainly have problems, but they are a
lot better than nothing. However, the areas where systematic work seems to
be going on to improve the lowest-rated articles, in combination with high
importance ratings, are relatively few. An automated system is hard to
argue with, & I'm concerned that such ratings will actually cause more
problems than they reveal or solve, if people take them more seriously
than they deserve, or are unable to override or question them.
One issue with the manual system is that it tends to give greatly
excessive weight to article length, as though there were a standard ideal
size for all subjects, which of course there isn't. It will be even harder
for an automated system to avoid the same pitfall without relying on the
very blunt instrument of our importance ratings, which don't pretend to
operate to common standards - nobody thinks that "high-importance" means,
or should mean, the same between, say,
WikiProject_Friesland <https://en.wikipedia.org/wiki/WikiProject_Frieslan…>
and WikiProject:Science.
John
Date: Wed, 16 Apr 2014 19:53:20 +0100
From: Charles Matthews <charles.r.matthews(a)ntlworld.com>
There's the old DREWS acronym from How Wikipedia Works, to which I'd now
add T for traffic. In other words, there are six factors that an
experienced human would use to analyse quality, looking in particular for
warning signs.
D = Discussion: crunch the talk page (20 archives = controversial, while no
comments indicates possible neglect)
R = WikiProject rating, FWIW, if there is one.
E = Edit history. A single editor, or essentially only one editor with
tweaking, is a warning sign. (Though not if it is me, obviously)
W = Writing. This would take some sort of text analysis. Work to do here.
Includes detection of non-standard format, which would suggest neglect by
experienced editors.
S = Sources. Count footnotes and so on.
T = Traffic. Pages at 100 hits per month are not getting many eyeballs.
Warning sign. Very high traffic is another issue.
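A rough sketch of how the crunchable parts of DREWS+T might become
automated warning flags; the input is a hypothetical dict of per-article
stats, and every field name and threshold here is an assumption:

def drews_t_flags(stats):
    flags = {}
    # D = Discussion: many archives suggests controversy; silence, neglect
    flags["controversial"] = stats["talk_archives"] >= 20
    flags["possibly_neglected"] = stats["talk_comments"] == 0
    # E = Edit history: essentially a single editor is a warning sign
    flags["single_author"] = stats["top_editor_share"] > 0.9
    # S = Sources: count footnotes and so on
    flags["unsourced"] = stats["footnotes"] == 0
    # T = Traffic: ~100 hits/month means few eyeballs; very high
    # traffic is flagged separately as a different kind of issue
    flags["few_eyeballs"] = stats["monthly_views"] <= 100
    flags["very_high_traffic"] = stats["monthly_views"] >= 100_000
    # R (WikiProject rating) and W (writing) would need external data
    # and text analysis, so this sketch leaves them out
    return flags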
Seems to me that there is enough to bite on, here.
Charles
_______________________________________________
Wikimedia UK mailing list
wikimediauk-l(a)wikimedia.org
http://mail.wikimedia.org/mailman/listinfo/wikimediauk-l
WMUK: https://wikimedia.org.uk