Pursuant to prior discussions about the need for a research
policy on Wikipedia, WikiProject Research is drafting a
policy regarding the recruitment of Wikipedia users to
participate in studies.
At this time, we have a proposed policy, and an accompanying
group that would facilitate recruitment of subjects in much
the same way that the Bot Approvals Group approves bots.
The policy proposal can be found at:
http://en.wikipedia.org/wiki/Wikipedia:Research
The Subject Recruitment Approvals Group mentioned in the proposal
is being described at:
http://en.wikipedia.org/wiki/Wikipedia:Subject_Recruitment_Approvals_Group
Before we move forward with seeking approval from the Wikipedia
community, we would like additional input about the proposal,
and would welcome additional help improving it.
Also, please consider participating in WikiProject Research at:
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Research
--
Bryan Song
GroupLens Research
University of Minnesota
Hi!
I am doing a PhD on an online civic participation project
(e-participation). As part of my research, I carried out a user
survey in which I asked how many people have ever edited or created a
page on a wiki. Now I would like to compare the results with the
overall rate of wiki editing/creation at the country level.
I've found some country-level statistics on Wikipedia Statistics (e.g.
3,000 editors of Wikipedia articles in Italy), but data for the UK and
France are not available, since Wikipedia provides statistics by
language, not by country. I'm thus looking for statistics on the UK and
France (but I am also interested in alternative ways of measuring wiki
editing/creation in Sweden and Italy).
I would be grateful for any tips!
Sunny regards, Alina
--
Alina ÖSTLING
PhD Candidate
European University Institute
www.eui.eu
Hi all;
I'm starting a new project, a wiki search engine. It uses MediaWiki,
Semantic MediaWiki and other minor extensions, and some tricky templates
and bots.
I remember Wikia Search and how it failed. It had the mini-article thingy
for the introduction, then a long list of links compiled by a crawler, and
also something similar to a social network.
My project idea (which still needs a cool name) is different. Although it
uses an introduction and images copied from Wikipedia, and some links from
the "External links" sections, that is only a start. The purpose is for the
community to add, remove, and order the results for each term, and to create
redirects for similar terms to avoid duplicates.
Why this? I think that Google PageRank isn't enough. It is frequently
abused by link farms, SEO, and other people trying to push their websites
higher.
Search for "Shakira" in Google, for example. You see 1) the official site,
2) Wikipedia, 3) Twitter, 4) Facebook, then some videos, some news, some
images, and Myspace. That wastes three or more results on obvious sites
(WP, TW, FB).
The wiki search engine puts these sites at the top, along with an
introduction and related terms, leaving all the space below for less
obvious but interesting websites. Also, if you search in Google for
"semantic queries" like "right-wing newspapers", you won't find actual
newspapers but rather "people and sites discussing right-wing
newspapers". Or latex and LaTeX are shown on the same results page.
These issues can be resolved with disambiguation result pages.
How do we choose which results go above or below? The rules are not fully
designed yet, but we can put official sites in first place, then .gov
or .edu domains, which are important ones, and later unofficial websites
and blogs, giving priority to the local language, etc., and reaching
consensus.
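A minimal sketch of what such tiered ranking rules might look like, assuming hypothetical result records with `url`, `lang`, and `official` fields (none of these names come from the project itself, and the real rules are still to be designed by consensus):

```python
# Hypothetical sketch of the tiered ranking described above: official
# sites first, then .gov/.edu domains, then everything else, preferring
# the local language within each tier. Field names and data are made up.

def rank_results(results, local_lang="es"):
    """Order search results by (tier, language preference)."""
    def tier(r):
        if r.get("official"):
            return 0                       # official site: top tier
        domain = r["url"].rsplit(".", 1)[-1]
        if domain in ("gov", "edu"):
            return 1                       # institutional domains next
        return 2                           # unofficial sites, blogs, etc.
    return sorted(results,
                  key=lambda r: (tier(r),
                                 0 if r.get("lang") == local_lang else 1))

results = [
    {"url": "http://blog.example.com", "lang": "en"},
    {"url": "http://example.edu", "lang": "en"},
    {"url": "http://shakira.com", "lang": "es", "official": True},
]
# Official site first, then the .edu domain, then the blog.
print([r["url"] for r in rank_results(results)])
```

In a real wiki search engine this ordering would of course be the community's editable result list, not a hard-coded function; the sketch only illustrates the default tiering.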
We can control aggressive spam with spam blacklists, by semi-protecting or
protecting highly visible pages, and by using bots or tools to check changes.
It obviously has a CC BY-SA license, and the results can be exported. I
think this approach is the opposite of Google's today.
For weird queries like "Albert Einstein birthplace" we can redirect to the
most obvious results page (in this case, Albert Einstein) using a hand-made
redirect or in software (a small change in MediaWiki).
You can check out a rough alpha version here: http://www.todogratix.es
(only in Spanish for now, sorry), which I'm feeding with some bots.
I think that it is an interesting experiment. I'm open to your questions
and feedback.
Regards,
emijrp
--
Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com
Pre-doctoral student at the University of Cádiz (Spain)
Projects: AVBOT <http://code.google.com/p/avbot/> |
StatMediaWiki<http://statmediawiki.forja.rediris.es>
| WikiEvidens <http://code.google.com/p/wikievidens/> |
WikiPapers<http://wikipapers.referata.com>
| WikiTeam <http://code.google.com/p/wikiteam/>
Personal website: https://sites.google.com/site/emijrp/
I was the one who raised the 1812 example in the context of
Wikipedia's coverage of military history; see Richard Jensen,
"Military History on the Electronic Frontier: Wikipedia Fights the
War of 1812," ''The Journal of Military History'' 76#4 (October
2012): 523-556; the page proofs (with some typos) are online at
http://www.americanhistoryprojects.com/downloads/JMH1812.PDF
My argument is that Wikipedia is written by and for the benefit of a
few thousand editors -- what the readers or the general public wants
or thinks or uses is largely irrelevant.
The growth then depends on the need to recruit new editors --
using the details from the 1812 article I suggest that fewer and
fewer new editors are actually interested. (I also looked at other
major articles on WWI, WWII, the American Civil War & others and
found the same pattern.)
Look at it demographically: apart from teenage boys coming of age,
the population of computer-literate people who are ignorant of
Wikipedia is very small indeed in 2012. That was not true in 2005
when lots of editors joined up and did a lot of work on important articles.
So I think that military history at Wikipedia is pretty well
saturated. That does not mean there are not more possible topics (we
have about 130,000 articles (including stubs) now and major libraries
will own maybe 100,000+ full length books on military topics). I
suggest that new editors need to have an attractive new niche that is
not now well covered. I suggest that they will have a very hard time
finding such a niche that allows for the excitement of new writing
about important topics (such as took place back in
2005-2007). Personally I greatly enjoyed writing about George
Washington and Ulysses Grant and Napoleon--that's why I'm here. I
would have trouble explaining to someone why they should write up
general #1001, #1002, #1003 ... let alone colonel #10,001, #10,002, #10,003 ....
Richard Jensen
User:Rjensen email rjensen(a)uic.edu
Would anyone have/know where to find any of the following estimates for
English Wikipedia, either as a number or as % of the total population of
editors (which is known):
* of people who edited Wikipedia anonymously
* of Wikipedians with a userpage
* of Wikipedians who have been registered for less than a year
* of Wikipedians who have been registered for less than a month
The data does not have to be current.
--
Piotr Konieczny
"To be defeated and not submit, is victory; to be victorious and rest on one's laurels, is defeat." --Józef Pilsudski
Dear colleagues,
My paper "Wikipedia. Between lay participation and elite knowledge
representation" has just been published at Information, Communication &
Society.
I'd be interested in your thoughts. Contact me if you don't have access
to that journal.
Best,
René
Abstract
The decentralized participatory architecture of the Internet challenges
traditional knowledge authorities and hierarchies. Questions arise about
whether lay inclusion helps to ‘democratize’ knowledge formation or if
existing hierarchies are re-enacted online. This article focuses on
Wikipedia, a much-celebrated example which gives an in-depth picture of
the process of knowledge production in an open environment. Drawing on
insights from the sociology of knowledge, Wikipedia's talk pages are
conceptualized as an arena where reality is socially constructed. Using
grounded theory, this article examines the entry for the September 11
attacks and its related talk pages in the German Wikipedia. Numerous
alternative interpretations (labeled as ‘conspiracy theories’) that
fundamentally contradict the account of established knowledge
authorities regarding this event have emerged. On the talk pages, these
views collide, thereby serving as a useful case study to examine the
role of experts and lay participants in the process of knowledge
construction on Wikipedia. The study asks how the parties negotiate
‘what actually happened’ and which knowledges should be represented in
the Wikipedia entry. The conflicting points of view overload the
discursive capacity of the contributors. The community reacts by
marginalizing opposing knowledge and protecting or immunizing the
article against these disparate views. This is achieved by rigorously
excluding knowledge which is not verified by external expert
authorities. Therefore, in this case, lay participation did not lead to
a ‘democratization’ of knowledge production, but rather re-enacted
established hierarchies.
http://www.tandfonline.com/doi/full/10.1080/1369118X.2012.734319
---
René König, Dipl.-Soz.
Karlsruhe Institute of Technology
Institute for Technology Assessment and Systems Analysis (ITAS)
P.O. Box 3640
76021 Karlsruhe
Germany
Tel.: +49 (0) 721 / 608-22209
Web/Skype: renekoenig.eu
Twitter: r_koenig
We have a new article in The Atlantic,
http://www.theatlantic.com/technology/archive/2012/10/surmounting-the-insur…
(which btw I found following Dario's twitter, @ReaderMeter, which I
recommend)
and it is still the same story of whether we have reached the limit of
what can be written. Without going into the details of this animated
debate (I have something to say; for instance, I just created two articles
which have about a hundred red links, and the material to fill in these
red links is available, but that would lead us away from the topic), I am
curious whether anybody has ever tried to estimate the possible number
of notable topics for articles. On a short time scale, it should grow
linearly with time, since we have new sports events, elections, TV
shows, movies, books, etc., and many people who were not previously
notable become notable. Thus, this number must be
N = a + b (t-2012),
where a is the number of topics notable now, t is the time in years,
and b is the number of new topics which become notable every year.
Has there been any research on what order of magnitude a and b have? I
guess b must be on the order of tens of thousands, since we are talking
about people. What about a? Is it dominated by the number of species of
insects, or cosmic bodies, or something else?
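To make the back-of-envelope model concrete, here is a tiny sketch evaluating N = a + b*(t - 2012). The values of a and b below are pure placeholders (their actual magnitudes are exactly the open question), so the output illustrates only the shape of the model, not an estimate:

```python
# Illustrative evaluation of the linear model N = a + b*(t - 2012).
# a = stock of notable topics in 2012, b = new notable topics per year.
# Both values here are placeholders, not research-based estimates.

def notable_topics(t, a, b):
    """Projected number of notable topics in year t under the linear model."""
    return a + b * (t - 2012)

a = 10_000_000   # hypothetical value of a
b = 50_000       # hypothetical value of b

print(notable_topics(2022, a, b))  # prints 10500000
```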
I tried to ask this question several years ago in the Russian Wikipedia,
but there was no conclusive answer.
Cheers
Yaroslav
We've released a full, anonymized dump of article ratings (aka AFTv4) collected over 1 year since the deployment of the tool on the entire English Wikipedia (July 22, 2011 - July 22, 2012).
http://thedatahub.org/en/dataset/wikipedia-article-ratings
The dataset (which includes 11m unique article ratings along 4 dimensions) is licensed under CC0 and supersedes the partial dumps originally hosted on the dumps server. Real-time AFTv4 data remains available as usual via the toolserver. Feel free to get in touch if you have any questions about this data.
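As a sketch of how one might start working with such a dump, here is a small example computing mean ratings per (article, dimension). The column names `page_title`, `dimension`, and `rating`, and the tab-separated layout, are assumptions for illustration only; check the dataset page for the actual schema:

```python
# Hypothetical aggregation over an AFT-style ratings dump. Column names
# (page_title, dimension, rating) and the TSV layout are assumed, not
# taken from the real dump's documentation.
import csv
import io
from collections import defaultdict

def mean_ratings(rows):
    """Average rating per (article, dimension) pair."""
    sums, counts = defaultdict(float), defaultdict(int)
    for row in rows:
        key = (row["page_title"], row["dimension"])
        sums[key] += float(row["rating"])
        counts[key] += 1
    return {k: sums[k] / counts[k] for k in sums}

# Tiny in-memory sample standing in for the real dump file.
sample = ("page_title\tdimension\trating\n"
          "Foo\ttrustworthy\t4\n"
          "Foo\ttrustworthy\t2\n")
rows = csv.DictReader(io.StringIO(sample), delimiter="\t")
print(mean_ratings(rows))  # {('Foo', 'trustworthy'): 3.0}
```

For the full 11m-row dump one would stream the file with the same `csv.DictReader` pattern rather than loading it into memory.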
Dario