Pursuant to prior discussions about the need for a research
policy on Wikipedia, WikiProject Research is drafting a
policy regarding the recruitment of Wikipedia users to
participate in studies.
At this time, we have a proposed policy, and an accompanying
group that would facilitate recruitment of subjects in much
the same way that the Bot Approvals Group approves bots.
The policy proposal can be found at:
http://en.wikipedia.org/wiki/Wikipedia:Research
The Subject Recruitment Approvals Group mentioned in the proposal
is being described at:
http://en.wikipedia.org/wiki/Wikipedia:Subject_Recruitment_Approvals_Group
Before we move forward with seeking approval from the Wikipedia
community, we would like additional input about the proposal,
and would welcome additional help improving it.
Also, please consider participating in WikiProject Research at:
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Research
--
Bryan Song
GroupLens Research
University of Minnesota
I have been working with Sam and others for some time now on brainstorming a
proposal for the Foundation to create a centralized wiki of citations, a
"WikiCite" so to speak, even if that is not the eventual name. My plan is to
continue discussing it with folks who are knowledgeable about and interested
in such a project, and to fold the feedback I receive into the proposal,
which I hope to write this summer. The proposal white paper will then be sent
around to interested parties, on-wiki and on mailing lists, for corrections
and feedback before eventually landing at the Foundation officially. As we
know, the WMF has not started a new project in some years, so there is no
official process; thus I find it important to get this right.
The basic idea is a centralized wiki that contains citation information that
other MediaWikis and WMF projects can then reference using something like a
{{cite}} template or a simple link. The community can document the citation,
the author, the book, etc., and, in one idealization, all citations across
all wikis would point to the same article on WikiCite. Users could also use
this wiki as a personal bibliography, since collections of citations can
be exported in arbitrary citation formats. This general plan would allow
community aggregation of metadata and community documentation of sources
along arbitrary dimensions (quality, trust, reliability, etc.). The hope is
that such a resource would then expand, on that wiki and across the projects,
into summarizations of collections of sources (lit reviews) that
make navigating entire fields of literature easier and more
reliable, freeing readers from the trap of being unaware of the global
context in which a particular source sits.
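To make the "exported in arbitrary citation formats" idea concrete, here is a minimal sketch in Python. The record schema and the two formatters are purely illustrative assumptions, not an actual WikiCite or WikiPapers data model:

```python
# Hypothetical sketch: one WikiCite-style record exported to two citation
# formats. Field names are illustrative, not a real WikiCite schema.

record = {
    "key": "KangHsuKrajbichEtAl09",
    "authors": ["Kang M.J.", "Hsu M.", "Krajbich I.M."],
    "year": 2009,
    "title": "The Wick in the Candle of Learning",
    "journal": "Psychological Science",
    "volume": 20,
    "number": 8,
    "pages": "963",
}

def to_bibtex(rec):
    """Render a record as a BibTeX @article entry."""
    fields = [
        f"  author  = {{{' and '.join(rec['authors'])}}}",
        f"  title   = {{{rec['title']}}}",
        f"  journal = {{{rec['journal']}}}",
        f"  year    = {{{rec['year']}}}",
        f"  volume  = {{{rec['volume']}}}",
        f"  number  = {{{rec['number']}}}",
        f"  pages   = {{{rec['pages']}}}",
    ]
    return "@article{%s,\n%s\n}" % (rec["key"], ",\n".join(fields))

def to_apa(rec):
    """Render a record as a rough APA-style string."""
    authors = ", ".join(rec["authors"])
    return (f"{authors} ({rec['year']}). {rec['title']}. "
            f"{rec['journal']}, {rec['volume']}({rec['number']}), {rec['pages']}.")

print(to_bibtex(record))
print(to_apa(record))
```

With one canonical record per source, any number of such formatters can be layered on top without duplicating the metadata.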
To give everyone a more concrete view, here is an example from some software
that I have implemented in our lab, called WikiPapers. Please note that while
this is a scientific-literature example, the idea generalizes to *all
publications ever*. Also, while I have implemented a full-featured version of
a WikiCite, it's important to point out that the WMF project will need a new
extension that handles its needs exactly, and in PHP (I use Python :).
The name of the wiki article is a unique key, a combination of the author
names and the year, in the following format:
Author1Author2Author3EtAl10b. This works for scientific articles, but we may
find we need to modify the key for other kinds of sources. The content of
the wiki article is composed of an infobox constructed via the Citation
template, plus any other text and media the community determines is useful
and legal to include in the article. Example article:
Screenshot of how this infobox renders on our wiki:
http://grey.colorado.edu/mediawiki/sites/mingus/images/0/0e/KangHsuKrajbich…
Title: KangHsuKrajbichEtAl09
{{Citation
|publisher=SAGE Publications
|dateadded=2010-07-17
|author=Kang M.J. and Hsu M. and Krajbich I.M. and Loewenstein G. and
McClure S.M. and Wang J.T. and Camerer C.F.
|url=http://pss.sagepub.com/content/20/8/963.full
|abstract=Curiosity has been described as a desire for learning and
knowledge, but its underlying mechanisms are not well understood. We scanned
subjects with functional magnetic resonance imaging while they read trivia
questions. The level of curiosity when reading questions was correlated with
activity in caudate regions previously suggested to be involved in
anticipated reward. This finding led to a behavioral study, which showed
that subjects spent more scarce resources (either limited tokens or waiting
time) to find out answers when they were more curious. The functional
imaging also showed that curiosity increased activity in memory areas when
subjects guessed incorrectly, which suggests that curiosity may enhance
memory for surprising new information. This prediction about memory
enhancement was confirmed in a behavioral study: Higher curiosity in an
initial session was correlated with better recall of surprising answers 1 to
2 weeks later.
|title=The Wick in the Candle of Learning
|bibtex type=article
|number=8
|volume=20
|owner=Sethherd
|journal=Psychological Science
|year=2009
|cites=O'ReillyFrank06,Cowan95,Wise04,Fuster80,Panksepp98,KakadeDayan02b,DelgadoLockeStengerEtAl03,BrewerZhaoDesmondEtAl98,DelgadoNystromFiez00,Beatty82,Baddeley92,Waanabe96,Roland93lm,DelgadoNystromFissellEtAl00,WagnerSchacterRotteEtAl98,SeymourDawDayanEtAl07,ODoherty04,BandettiniMoonen99,ODohertyDayanFristonEtAl03,RogersOwenRobbins99,KnutsonWestdorpKaiserEtAl00,CircuitryMemory,OReillyFrank06,Watanabe96a,BrewerZhaoGabrieli98,WagnerSchacterBuckner98,RogersOwenMiddletonEtAl99,Baddeley86,Watanabe96,Rolls96a,PallerWagner02
|cited_by=Author1Author2Author3EtAl10,etc...
|pages=963
}}
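As a sketch of the key scheme described above, here is how such keys could be generated in Python. This is my reading of the format; in particular, the trailing disambiguation suffix ('b', 'c', ...) handling is an assumption:

```python
# Sketch of the Author1Author2Author3EtAl10b key scheme: up to three author
# last names, 'EtAl' when there are more, a two-digit year, and an optional
# suffix to disambiguate collisions (the suffix convention is assumed).

def make_key(last_names, year, suffix=""):
    """Build a citation key such as KangHsuKrajbichEtAl09."""
    key = "".join(last_names[:3])
    if len(last_names) > 3:
        key += "EtAl"
    return key + f"{year % 100:02d}" + suffix

print(make_key(["Kang", "Hsu", "Krajbich", "Loewenstein"], 2009))
# KangHsuKrajbichEtAl09
```

Such a deterministic key means any wiki can construct the link to the central entry from bibliographic data alone.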
Then, any other WMF wiki, or any other MediaWiki, could cite this universal
entry by simply typing {{cite|KangHsuKrajbichEtAl09}}
Additionally, if a technology such as Semantic MediaWiki is used (as it is
in WikiPapers), arbitrary lists of collections of literature can be
generated by constructing simple queries that are boolean combinations of
template properties. Given that SMW does not scale well, I have a plan that
uses Lucene instead for fast, scalable dynamic generation of collections of
citations. Imagine the possibilities...
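To illustrate the kind of query meant here (a boolean combination of template properties), consider this toy Python sketch. The in-memory filtering stands in for what SMW or a Lucene index would do at scale; the records and field names are illustrative:

```python
# Toy illustration of querying a citation collection by boolean combinations
# of template properties, as SMW (or a Lucene index) would do at scale.

citations = [
    {"key": "KangHsuKrajbichEtAl09", "journal": "Psychological Science",
     "year": 2009, "bibtex_type": "article"},
    {"key": "ODoherty04", "journal": "Current Opinion in Neurobiology",
     "year": 2004, "bibtex_type": "article"},
]

def query(coll, predicate):
    """Return the keys of all citations matching an arbitrary predicate."""
    return [c["key"] for c in coll if predicate(c)]

# All journal articles published after 2005:
recent = query(citations,
               lambda c: c["bibtex_type"] == "article" and c["year"] > 2005)
print(recent)  # ['KangHsuKrajbichEtAl09']
```

Any page could then embed such a query to render a live, automatically maintained bibliography.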
Feel free to provide your feedback on this idea, in addition to your own
ideas, in this thread, or to me personally. I am especially interested in
the potential benefits to the WMF projects that you see, and to hear your
thoughts on the potential of this project on its own, as that will feature
prominently in the proposal. Additionally, what do you think WikiCite would
eventually be like, once it is fully matured?
Brian Mingus
Graduate Student
Computational Cognitive Neuroscience Lab
University of Colorado at Boulder
On Mon, Jul 19, 2010 at 11:22 AM, phoebe ayers <phoebe.wiki(a)gmail.com> wrote:
> There have been a number of proposals floated in the Wikimedia
> community over the years to build a wiki-based project for collecting
> journal citation information. For those interested in that topic, you
> might want to check out the University of Prince Edward Island's
> "knowledge for all" project proposal -- it proposes to build an open
> universal citation index (to serve as an alternative to the many
> hundreds of proprietary citation index products that libraries
> currently buy). This of course is not the first attempt at this
> problem, but it's an interesting proposal that's getting a bit of buzz
> in the library community.
> http://library.upei.ca/k4all
>
> -- phoebe
>
> --
> * I use this address for lists; send personal messages to phoebe.ayers
> <at> gmail.com *
>
For those of you working with pageview data from
<http://dammit.lt/wikistats> (also the source for stats.grok.se),
please note that over the last several weeks, a significant percentage
(about a third) of pageviews weren't being logged due to packet loss
on the aggregating server. The setup is still pretty fragile, so
please be wary of taking that data too seriously until we've
stabilized it.
--
Erik Möller
Deputy Director, Wikimedia Foundation
Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
==================================================================
CALL FOR PARTICIPATION
COLING 2010 Workshop
The 2nd Workshop on "The People's Web meets NLP:
Collaboratively Constructed Semantic Resources"
http://www.ukp.tu-darmstadt.de/scientific-community/coling-2010-workshop/
-----------------------------------------------------------------
Beijing, China, August 28, 2010
COLING 2010
KEYWORDS:
Wikipedia, Wiktionary, Mechanical Turk, Games with a purpose,
Folksonomies, Twitter, Social Networks
INVITED TALK
Tat-Seng Chua, National University of Singapore
REGISTRATION
http://www.coling-2010.org/Registration.htm
PRELIMINARY PROGRAM
9:15-9:30 Opening Remarks
9:30-10:00
Constructing Large-Scale Person Ontology from Wikipedia, Yumi Shibaki, Masaaki
Nagata and Kazuhide Yamamoto
10:00-10:30
Using the Wikipedia Link Structure to Correct the Wikipedia Link Structure, Benjamin
Mark Pateman and Colin Johnson
10:30-11:00 Coffee Break
11:00-11:30
Extending English ACE 2005 Corpus Annotation with Ground-truth Links to Wikipedia,
Luisa Bentivogli, Pamela Forner, Claudio Giuliano, Alessandro Marchetti, Emanuele
Pianta and Kateryna Tymoshenko
11:30-12:00
Expanding textual entailment corpora from Wikipedia using co-training, Fabio Massimo
Zanzotto and Marco Pennacchiotti
12:00-12:30
Pruning Non-Informative Text Through Non-Expert Annotations to Improve Aspect-Level
Sentiment Classification, Ji Fang, Bob Price and Lotti Price
12:30-14:00 Lunch Break
14:00-15:00
Invited Talk by Tat-Seng Chua, National University of Singapore
15:00-15:30
Measuring Conceptual Similarity by Spreading Activation over Wikipedia's Hyperlink
Structure, Stephan Gouws, G-J van Rooyen and Herman A. Engelbrecht
15:30-16:00 Coffee Break
16:00-16:30
Identifying and Ranking Topic Clusters in the Blogosphere, M. Atif Qureshi, Arjumand
Younus, Muhammad Saeed and Nasir Touheed
16:30-16:50
Helping Volunteer Translators, Fostering Language Resources, Masao Utiyama, Takeshi
Abekawa, Eiichiro Sumita and Kyo Kageura
16:50-17:30 Discussion
ORGANIZERS
Iryna Gurevych
Torsten Zesch
Ubiquitous Knowledge Processing Lab
Technische Universität Darmstadt, Germany
PROGRAM COMMITTEE
Andras Csomai Google Inc.
Anette Frank Heidelberg University
Benno Stein Bauhaus University Weimar
Bernardo Magnini ITC-irst Trento
Christiane Fellbaum Princeton University
Dan Moldovan University of Texas at Dallas
Delphine Bernhard LIMSI-CNRS, Orsay
Diana McCarthy Lexical Computing Ltd
Elke Teich Technische Universität Darmstadt
Emily Pitler University of Pennsylvania
Eneko Agirre University of the Basque Country
Erhard Hinrichs Eberhard Karls Universität Tübingen
Ernesto De Luca Technische Universität Berlin
Florian Laws University of Stuttgart
Gerard de Melo MPI Saarbrücken
German Rigau University of the Basque Country
Graeme Hirst University of Toronto
Günter Neumann DFKI Saarbrücken
György Szarvas Technische Universität Darmstadt
Hans-Peter Zorn European Media Lab, Heidelberg
José Iria University of Sheffield
Laurent Romary LORIA, Nancy
Magnus Sahlgren Swedish Institute of Computer Science
Manfred Stede Potsdam University
Omar Alonso A9.com, Inc.
Pablo Castells Universidad Autónoma de Madrid
Paul Buitelaar DERI, National University of Ireland, Galway
Philipp Cimiano Delft University of Technology
Razvan Bunescu University of Texas at Austin
Rene Witte Concordia University Montréal
Roxana Girju University of Illinois at Urbana-Champaign
Saif Mohammad University of Maryland
Samer Hassan University of North Texas
Sören Auer Leipzig University
Tonio Wandmacher CEA, Paris
INTRODUCTION
The workshop builds upon the success of the first ACL "The People's Web
meets NLP" Workshop in 2009 that attracted 21 submissions. Accepted
submissions included papers on Wikipedia [1], Wiktionary [2], Mechanical
Turk [3], and game-based construction of semantic resources [4]. This
clearly demonstrates a substantial and growing interest of the NLP
community in collaboratively constructed semantic resources (CSRs),
also evidenced by the increasing number of publications in this area
and the EMNLP 2009 Web 2.0 track. In many works, CSRs have been used
to overcome the knowledge acquisition bottleneck and coverage problems
pertinent to conventional lexical semantic resources. The greatest
popularity in this respect can so far certainly be attributed to
Wikipedia [1]. However, other resources, such as folksonomies or the
multilingual collaboratively constructed dictionary Wiktionary, have
also shown great potential. Thus, the scope of the workshop deliberately
includes any collaboratively constructed resource, not only Wikipedia.
Effective deployment of CSRs to enhance NLP introduces a pressing need
to address a set of fundamental challenges, e.g. the interoperability
with existing resources, or the quality of the extracted lexical
semantic knowledge. Interoperability between resources is crucial as
no single resource provides perfect coverage. The quality of CSRs is
a fundamental issue, as they lack editorial control and entries are
often incomplete. Thus, techniques for link prediction [5] or
information extraction [6] have been proposed to guide the "crowds"
while constructing resources of better quality.
[1] Olena Medelyan, David Milne, Catherine Legg and Ian H. Witten.
Mining meaning from Wikipedia.
In: International Journal of Human-Computer Studies. 67(9), 2009.
[2] Torsten Zesch, Christof Mueller and Iryna Gurevych
Extracting Lexical Semantic Knowledge from Wikipedia and Wiktionary
Proceedings of the Conference on Language Resources and Evaluation
(LREC), 2008.
http://www.ukp.tu-darmstadt.de/software/jwpl/
http://www.ukp.tu-darmstadt.de/software/jwktl/
[3] Rion Snow, Brendan O'Connor, Daniel Jurafsky and Andrew Y. Ng.
Cheap and Fast---But is it Good? Evaluating Non-Expert Annotations
for Natural Language Tasks.
Proceedings of EMNLP. 2008.
[4] Luis von Ahn and Laura Dabbish.
General Techniques for Designing Games with a Purpose.
Communications of the ACM, 2008.
[5] Rada Mihalcea and Andras Csomai
Wikify!: Linking Documents to Encyclopedic Knowledge.
Proceedings of the Sixteenth ACM Conference on Information and
Knowledge Management, CIKM 2007.
[6] Daniel S. Weld et al.
Intelligence in Wikipedia.
Twenty-Third Conference on Artificial Intelligence (AAAI), 2008.
In reference to the discussion about citations, we've recently added a
'Wikipedia citation' link to Open Library. For example:
http://openlibrary.org/books/OL17963918M/An_inland_voyage
At the bottom of the page on the right is this:
Download catalog record: RDF / JSON | Wikipedia citation
The Wikipedia citation link will give you a citation template to copy
and paste to Wikipedia. We would welcome any comments about this
citation template.
We would like to have a list on the page, "what cites this book" for
citations in Wikipedia and elsewhere on the web.
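For readers curious what such a generated citation might look like, here is a rough Python sketch of turning a catalog record into {{cite book}} wikitext. The record fields and the exact template output are assumptions for illustration, not Open Library's actual schema or output:

```python
# Hypothetical sketch: render a catalog record as a {{cite book}} wikitext
# template, roughly what a 'Wikipedia citation' link might produce.
# Field names are illustrative, not Open Library's actual schema.

def cite_book(rec):
    """Render a record as a {{cite book}} wikitext template."""
    parts = ["{{cite book"]
    for field in ("title", "author", "publisher", "year", "isbn"):
        if field in rec:
            parts.append(f" | {field} = {rec[field]}")
    parts.append("}}")
    return "\n".join(parts)

book = {
    "title": "An Inland Voyage",
    "author": "Robert Louis Stevenson",
    "year": 1878,
}
print(cite_book(book))
```

Missing fields are simply omitted, so partial catalog records still yield a valid template for editors to complete.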
--
Edward.
Jeff makes some good points about page numbers on public-lld (where I had forwarded part of this conversation). -Jodi
Begin forwarded message:
> Resent-From: public-lld(a)w3.org
> From: "Young,Jeff (OR)" <jyoung(a)oclc.org>
> Date: 20 July 2010 22:53:40 GMT+01:00
> To: "Tom Morris" <tfmorris(a)gmail.com>
> Cc: "Karen Coyle" <kcoyle(a)kcoyle.net>, "Jodi Schneider" <jschneider(a)pobox.com>, "public-lld" <public-lld(a)w3.org>, "Code for Libraries" <CODE4LIB(a)listserv.nd.edu>, "Brian Mingus" <Brian.Mingus(a)colorado.edu>
> Subject: RE: "universal citation index"
>
> I suspect this discussion happened on code4lib before the thread got
> cross-posted to LLD XG, where I first saw it.
>
> There are undoubtedly a ton of diverse use cases, but that doesn't mean
> APIs are the best solution. Here are some spitball possibilities for
> "not just manifestations" and "we need page numbers".
>
> http://example.org/frbr:serial/2/citation-apa.{bcp-47}.txt
> http://example.org/frbr:manifestation/1/citation-apa.{bcp-47}.txt?xyz:startPage=5&xyz:endPage=6
>
> I'm imagining an xyz ontology with startPage and endPage, but we can
> surely create it if something doesn't already exist.
>
> Jeff
>
>> -----Original Message-----
>> From: Tom Morris [mailto:tfmorris@gmail.com]
>> Sent: Tuesday, July 20, 2010 5:37 PM
>> To: Young,Jeff (OR)
>> Cc: Karen Coyle; Jodi Schneider; public-lld; Code for Libraries; Brian
>> Mingus
>> Subject: Re: "universal citation index"
>>
>> On Tue, Jul 20, 2010 at 1:40 PM, Young,Jeff (OR) <jyoung(a)oclc.org>
>> wrote:
>>> In terms of Linked Data, it should make sense to treat citations as
>>> text/plain variant representations of a FRBR Manifestation.
>>
>> As Karen mentioned, many types of citation need more information than
>> just the manifestation. You also need pages numbers, etc.
>>
>> Tom
>
>
>
Przykuta, 18/07/2010 13:38:
> Huh. Try editing any article on pl.wiki as an IP without an edit summary ;) This feature (asking for summaries) was brought to pl.wiki from cs.wiki. The script (red border) was written by Nux (probably).
Thank you. When was this feature introduced?
Ortega's statistics
http://wikimania2010.wikimedia.org/wiki/File:Felipe_Ortega,_Flagged_revisio…
show that the mandatory edit summary on de.wiki (which is a different
feature) reduced anonymous contributions a lot.
Nemo