Wikiquality-l December 2007

wikiquality-l@lists.wikimedia.org

18 participants
10 discussions

Re: [Wikiquality-l] Wikipedia colored according to trust

by Jonathan Leybovich

One thing that stood out for me in the small sample of articles I examined was the flagging of innocuous changes by casual users to correct spelling, grammar, etc. Thus a "nice-to-have" would be a "smoothing" algorithm that ignores inconsequential changes such as spelling corrections, etc. or the reordering of semantically-contained units of text (for example, reordering the line items in a list w/o changing the content of any particular line item, etc., or the reordering of paragraphs and perhaps even sentences.) I think this would cover 90% or more of changes that are immaterial to an article's credibility.
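As a rough illustration of the "reordering" half of this idea, here is a minimal sketch. It assumes each non-empty line is one self-contained unit (a paragraph or list item); the function names are hypothetical and are not part of the trust-coloring code.

```python
from collections import Counter


def units(text: str) -> list[str]:
    # Assumption: each non-empty line is one self-contained unit
    # (a paragraph or a list item).
    return [line.strip() for line in text.splitlines() if line.strip()]


def is_pure_reordering(old: str, new: str) -> bool:
    # True when the same units appear the same number of times,
    # only in a different order, so no unit's interior was modified.
    old_units, new_units = units(old), units(new)
    return old_units != new_units and Counter(old_units) == Counter(new_units)
```

A change that passes this check moved units around without touching their content, so a smoother could leave the trust of the moved text unchanged.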

16 years, 3 months

Re: [Wikiquality-l] Wikiquality-l Digest, Vol 10, Issue 12

by Jonathan Leybovich

> Date: Fri, 21 Dec 2007 10:34:47 -0800
> From: "Luca de Alfaro" <luca(a)dealfaro.org>
>
> If you want to pick out the malicious changes, you need to flag also small
> changes.
>
> "Sen. Hillary Clinton did *not* vote in favor of war in Iraq"
>
> "John Doe, born in *1947*"
>
> The ** indicates changes.

Yes, and I did not mean to include cases such as these, which involve the insertion of a few words that could radically alter the semantic content of a unit of text. But legitimate spelling corrections (which can be easily identified by using any of the various spell-checker databases to determine the set of common misspellings for a word) do not. In short, I cannot imagine a case where someone changing "Senater Clinton" to "Senator Clinton" could involve vandalism (the "smoother" algorithm should of course also take into account that if a "misspelling" appears repeatedly in an article, or even better, in related subject articles by different authors, it is probably a valid technical term or a proper name).

I also cannot imagine how moving a large block of relatively self-contained text (i.e. a paragraph, since even parsing at the level of sentences is problematic given all the uses for the period '.') without modifying its interior could have any large semantic repercussions (readability is, of course, a matter for a different discussion ;-)

Again, these are mainly quibbles, but for the articles I sampled it was quite annoying to have my eye repeatedly drawn to a single orange word that represented nothing more than a minor, good-faith correction. And overall the system seems to work well!
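The spelling-correction check described above could look something like the following sketch. The misspelling table, the function name and the `max_repeats` cutoff are all hypothetical; a real table would come from a spell-checker database.

```python
from collections import Counter

# Hypothetical table of common misspellings and their corrections.
COMMON_MISSPELLINGS = {"senater": "senator", "recieve": "receive"}


def is_spelling_fix(old_word, new_word, article_words, max_repeats=2):
    old_key, new_key = old_word.lower(), new_word.lower()
    # Only a known misspelling replaced by its listed correction qualifies.
    if COMMON_MISSPELLINGS.get(old_key) != new_key:
        return False
    # If the "misspelling" appears repeatedly in the article, assume it is
    # a valid technical term or proper name and do not smooth the change.
    counts = Counter(word.lower() for word in article_words)
    return counts[old_key] <= max_repeats
```

Changes such as inserting "not" or altering a birth year never match the table, so they would still be flagged.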

16 years, 4 months

Stable versions

by John Erling Blad

I have to make a note at our signpost about the present state of Stable versions. Is it coming? When?

John E Blad

16 years, 4 months

Wikipedia coloured according to trust

by mike.lifeguard

Is it possible to get some of English Wikibooks up as an experiment? If you recall, I was concerned about how our slow editing rate and small editor community would impact the utility of this implicit system of rating. I'd also like to see how trust varies across modules of a book. Thanks! Mike.lifeguard (@en.wikibooks)

16 years, 4 months

Re: [Wikiquality-l] Wikipedia colored according to trust

by Luca de Alfaro

Dear All,

we have a demo at http://wiki-trust.cse.ucsc.edu/ that features the whole English Wikipedia, as of its February 6, 2007 snapshot, colored according to text trust. This is the first time that even we can look at how the "trust coloring" looks on the whole of the Wikipedia!

We would be very interested in feedback (the wikiquality-l(a)lists.wikimedia.org mailing list is the best place). If you find bugs, you can email us at http://groups.google.com/group/wiki-trust.

Happy Holidays!

Luca

PS: yes, we know, some images look off. It is currently fairly difficult for a site outside of the Wikipedia to fetch Wikipedia images correctly.

PPS: there are going to be a few planned power outages on our campus in the next days, so if the demo is off, try again later.

16 years, 4 months

Only 0.0037% internet users arrive to a vandalised article

by Platonides

The following brief was published yesterday, in Spanish, in the free Spanish newspaper 20minutos (www.20minutos.es). I can send you the original if you want. The numbers are quite interesting. Does anyone know more about that study?

=What you need to know about... ...editing Wikipedia=

==Many consult this site, but few venture to add new content==

[[Image:Wikipedia logo]]

The best reference work online, made in a disinterested way by internet users, was born in its current identity on 15 January 2001. Since then, more than 2 million articles have been created in its English version, and more than 300,000 in the Spanish one. Currently, it is available in 253 languages. Anyone can modify the articles or create new ones, but according to an investigation by the University of Minnesota, only a small percentage of those who visit it 'work' on it. Another curious fact is that a tiny percentage of the users (1%) are responsible for half the content. The fact that few people really feed the content doesn't detract from its quality. The same study points out that the probability that a user arrives at an imprecise or vandalised article is very small: only about 0.0037%. Moreover, 40% of the malicious changes are fixed before the article is read by two different users.

16 years, 4 months

Wikimedia Quality Portal translations needed

by Casey Brown

The Wikimedia Quality portal <http://quality.wikimedia.org/> is being mentioned more and more lately. That being said, we should probably work to get more translations made of it so that more users will be able to read and understand it. I have made a list of what translations should be done at <http://quality.wikimedia.org/wiki/User:Cbrown1023/Translations>. However, translations in languages other than those on that page are more than welcome.

Currently, the following languages are especially needed:
* Japanese - ja - 日本語
* Arabic - ar - العربية
* Bengali - bn - বাংলা
* Hindi - hi - हिन्दी
* Indonesian - id - Bahasa Indonesia
* Dutch - nl - Nederlands

If you have any questions about the meanings of the words used, feel free to e-mail me or the list. :-) Thanks in advance for any translations you can do or help with!

--
Casey Brown
Cbrown1023

---
Note: This e-mail address is used for mailing lists. Personal emails sent to this address will probably get lost.

16 years, 4 months

Re: [Wikiquality-l] Implicit vs Explicit metadata

by Aaron Schulz

Bah, I meant to send this here, not to just one person...

OK, in order to talk about pros vs. cons, we need to consider the uses first. Some main tasks are:
1) selecting an unvandalized version (for AT, this will do "worse part" checks and such)
2) selecting a quality, fact-checked version (German Wikipedia wants this)
3) selecting a consensus version/marking featured pages
4) selecting the best version and displaying it by *default*

Selecting an unvandalized version

Flagged Revisions (pros):
1) Templates and images are part of the review process, so vandalism to them will not show for reviewed pages
2) Users with review rights get the gratification of setting the latest unvandalized/"sighted" version
3) New users can look forward to getting these rights in short order, after being considered trusted
4) Edits by reviewers can be autoreviewed if they are to a page where the stable and current versions are synced.
5) If 4 above is not possible, a diff of the changes to the stable version is shown to reviewers after the edit, with a review form with the tags preselected. It shouldn't take long at all to glance over the changes and click "review".

Flagged Revisions (cons):
1) Initial review takes noticeable time for non-stub pages
2) Revisions can fall out of date if not maintained, so people clicking to see a stable version may get a really old one. This is integrated with the RC patrolling system and with the autoreviewing/quick diffs to help, but it is still a possibility.

Article Trust (pros):
1) No workload added, all automatic. This is very nice.
2) Fast and fluid, since calculations for the sighted version are done on every edit without anyone having to do anything
3) Accounts for consensus, so no rouge reviewer can easily flag garbage. Still, a "trusted" user can go rouge and add garbage, even in several edits to bump the trust.

Article Trust (cons):
1) Template and image vandalism is still a problem
2) Bot and AWB edits flying through pages automatically make the trust of large chunks of text increase
3) No direct control over it by anyone -> incentive loss

Selecting a quality, fact-checked version

Flagged Revisions (pros):
1) Trusted users, who have some respect for consensus as well, can directly mark off solid revisions

Flagged Revisions (cons):
1) If the user goes rouge they can flag garbage. Not as bad as rouge admins, likely rare, but something to think about...
2) A roguish user may ignore consensus and reasonable fact disputes. This could result in a single user or a cabal having a monopoly over the "best version". Good policy standards and respect should be enforced to avoid this.

Article Trust (pros):
1) Nearly all "white" pages have a good chance of being reasonably accurate
2) No work required
3) Harder to form cabals/monopolies over the "best version"

Article Trust (cons):
1) Again, bots and such
2) You cannot edit anything without vouching for it (bad for fixing typos) and the text around it. Either people get afraid to edit or dubious text gets more and more "trusted"
3) No one is necessarily committed to having fact-checked anything

Selecting a consensus version/marking featured pages

Flagged Revisions (pros):
1) Trusted reviewers (higher flagging rights than normal reviewers), like bureaucrats, look at debates and see if a consensus for a community-selected version exists
2) As long as the trusted reviewer acts like most "bureaucrats" on Wikipedia and just measures consensus, it is not easy to game

Flagged Revisions (cons):
1) Rouge trusted reviewer...blah...could end up at arbcom
2) Not automatic...I mean look at how slow WP:FA stuff is...

Article Trust (pros):
1) It would take a lot of users to try to edit war to push the "consensus version" around since it is automatic, and that would just add a bunch of red text to the current revision, which would cause it not to be selected.
2) Automatic, accounts for all edits, not just those "voting" on some talk page
3) Generally waaay faster

Article Trust (cons):
1) If there is consensus for a version clearly demonstrated on a talk page, a small group of editors can still edit war over it and drop the trust. It would be nice for some trusted reviewer to be able to see this and expediently flag it.

Selecting the best version and displaying it by *default*

Flagged Revisions (pros):
1) Again, template/image vandalism won't be such a problem since those are set for each reviewed revision based on how it was when reviewed.
2) For bios of living people, we can easily and immediately set the stable version without having to fiddle around getting it autotrusted.
3) The incentive issue again: reviewers can set this, and editors can look forward to becoming reviewers.

Flagged Revisions (cons):
1) Rouge trusted reviewer...blah...could end up at arbcom
2) Spelling/grammar errors can get stuck if no one is around to review corrections (though reviewers' spelling fixes could be autoreviewed sometimes)

Article Trust (pros):
1) The default is BY FAR the most important revision, so giving direct control over it gives incentives to form evil cabals :)
2) As this is important, it helps to stay up to date with the workload; this requires none...so that's pretty easy...

Article Trust (cons):
1) Spelling/grammar errors can get stuck since it's hard to directly control
2) The "highest least trusted" and "max age" and other heuristics will be confusing to new users. Default page selection will feel kind of random

This is still an incomplete list probably...I should probably save this somewhere and build on it.

Also, for default revision selection on page view, I am just comparing the methods of selection by the two extensions. We could have it where Flagged Revisions does the overriding of the default revision, but that it grabs the Article Trust "most trusted" version rather than some reviewed one. This would just be to avoid duplicated code though.

-Aaron Schulz
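To make the "highest least trusted" and "max age" heuristics concrete, here is one possible reading as a sketch. The message does not define them, so the interpretation is an assumption: among revisions no older than a cutoff, pick the one whose least-trusted word has the highest trust. The `Revision` type, its field names and `pick_default` are hypothetical and are not the API of either extension.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    rev_id: int
    age_days: float
    word_trust: list[float]  # assumed: one trust value per word


def pick_default(revisions, max_age_days):
    # "max age" (assumed meaning): ignore revisions older than the cutoff.
    candidates = [r for r in revisions
                  if r.age_days <= max_age_days and r.word_trust]
    if not candidates:
        return None
    # "highest least trusted" (assumed meaning): maximise the minimum
    # word trust, preferring the newer revision on ties.
    return max(candidates, key=lambda r: (min(r.word_trust), -r.age_days))
```

Under this reading a single low-trust word is enough to keep a recent revision from becoming the default, which may be part of why the selection could feel random to new users.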

16 years, 5 months

Implicit vs Explicit metadata

by Waldir Pimenta

I am sure this has already been discussed, but just in case, here go my two cents:

The post at http://breasy.com/blog/2007/07/01/implicit-kicks-explicits-ass/ explains why implicit metadata (like Google's PageRank) are better than explicit metadata (like Digg votes). Making a comparison to Wikimedia, I'd say that Prof. Luca's trust algorithm is a more reliable way to determine the quality of an article's text than the Flagged Revisions extension. However, the point of the latter is to provide a stable version to the user who chooses that, while the former displays the degree to which the info can be trusted, but still shows the untrusted text.

What I'd like to suggest is the implementation of a filter based on the trust calculations of Prof. Luca's algorithm, which would use the editors' calculated reliability to automatically choose which revision of an article to display. It could be implemented in 3 ways:

1. Show the last revision of an article made by an editor with a trust score higher than the value that the reader provided. The trusted editor is implicitly setting a minimum quality flag on the article by saving a revision without changing other parts of the text. This is the simplest approach, but it doesn't prevent untrusted text from showing up, in case the trusted editor leaves untrusted parts of the text unchanged.

2. Filter the full history. Basically, the idea is to show only the parts of the article written by users with a trust score higher than the value that the reader provided. This would work like Slashdot's comment filtering system, for example. Evidently, this is the most complicated approach, since it would require an automated conflict resolution system, which might not be possible.

3. A mixed option could be to try to hide revisions by editors with a lower trust value than the threshold set. This could be done as far back in the article history as possible, until a content conflict is found.

Instead of trust values, this could also work by setting the threshold above unregistered users, or newbies (I think this is approximately equivalent to accounts younger than 4 days).

Anyway, these are just rough ideas, on which I'd like to hear your thoughts.
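Option 1 above can be sketched in a few lines. This assumes the history is available as (revision id, editor) pairs, oldest first, and that editor trust is a plain dictionary; all names are hypothetical, and unknown editors are treated as having zero trust.

```python
def last_trusted_revision(history, editor_trust, threshold):
    # history: list of (rev_id, editor) pairs, oldest first (assumed shape).
    # Walk backwards from the newest revision and return the first one
    # saved by an editor whose trust exceeds the reader's threshold.
    for rev_id, editor in reversed(history):
        # Assumption: editors without a score count as trust 0.0.
        if editor_trust.get(editor, 0.0) > threshold:
            return rev_id
    return None  # no revision qualifies at this threshold
```

For example, with history `[(1, "alice"), (2, "bob"), (3, "anon")]` and trust `{"alice": 0.9, "bob": 0.6}`, a threshold of 0.5 selects revision 2. As noted in option 1, that revision may still contain untrusted text that the trusted editor left unchanged.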

16 years, 5 months

Status of de:wp implementation?

by David Gerard

I haven't had any press queries about this as yet, but if I do: what's the status of putting something into place on de:wp as planned? I understand it's delayed by devs being busy with the WMF fundraiser ... - d.

16 years, 5 months
