For those of you who haven't seen it, take a look at Domas Mituzas' wiki-stats:
http://dammit.lt/wikistats/
This is real, accurate hourly snapshot data on access to Wikipedia,
captured from the Wikimedia Squid servers. Project counts show the
total accesses in a time period to the different language editions.
This is great stuff for visualization, behavioral pattern analysis,
and other purposes. If you do something with it, let us know. :-)
URL may change in the future - we'll put a redirect on the above one
if that happens.
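For anyone who wants to play with the hourly files, here is a minimal parsing sketch in Python. It assumes each line holds a project code, a page title, a request count, and a byte count, space-separated; the sample lines are made up, so check the actual dump format before relying on this:

```python
from collections import defaultdict

def sum_project_counts(lines):
    """Sum hourly request counts per project from space-separated
    dump lines (assumed format: project title count bytes)."""
    totals = defaultdict(int)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed or comment lines
        project, _title, count, _bytes = parts
        totals[project] += int(count)
    return dict(totals)

# Hypothetical sample lines, not real dump data:
sample = [
    "en Main_Page 42 100000",
    "en Wiki 8 20000",
    "de Hauptseite 30 90000",
]
print(sum_project_counts(sample))  # {'en': 50, 'de': 30}
```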
--
Erik Möller
Deputy Director, Wikimedia Foundation
Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
> I posted it here:
>
> http://www.wiki-translation.com/tiki-index.php?page=DanielKinzlerThesis&bl=n
>
> You can see the English translation here:
>
> http://translate.google.com/translate?u=http%3A%2F%2Fwww.wiki-translation.com%2Ftiki-index.php%3Fpage%3DDanielKinzlerThesis%26bl%3Dn&hl=en&ie=UTF8&sl=de&tl=en
Hum... If I go to the above translation link, only the first bit is
actually translated. I guess Google gives up after a while and leaves
the rest in German.
If you can split it into separate HTML pages, it would make it easier
for people to read it with Google translate.
Alain
My diploma thesis about a system to automatically build a multilingual thesaurus
from Wikipedia, "WikiWord", is finally done. I handed it in yesterday. My
research will hopefully help to make Wikipedia more accessible for automatic
processing, especially for applications in natural language processing, machine
translation, and information retrieval. What this could mean for Wikipedia is:
better search and conceptual navigation, tools for suggesting categories, and more.
Here's the thesis (in German, I'm afraid): <http://brightbyte.de/DA/WikiWord.pdf>
Daniel Kinzler, "Automatischer Aufbau eines multilingualen Thesaurus durch
Extraktion semantischer und lexikalischer Relationen aus der Wikipedia",
Diplomarbeit an der Abteilung für Automatische Sprachverarbeitung, Institut
für Informatik, Universität Leipzig, 2008.
For the curious, http://brightbyte.de/DA/ also contains source code and data.
See <http://brightbyte.de/page/WikiWord> for more information.
Some more data is for now available at
<http://aspra27.informatik.uni-leipzig.de/~dkinzler/rdfdumps/>. This includes
full SKOS dumps for en, de, fr, nl, and no covering about six million concepts.
The thesis ended up being rather large... a 220-page thesis and 30k lines of
code. I'm planning to write a research paper in English soon, which will give an
overview of WikiWord and what it can be used for.
The thesis is licensed under the GFDL, WikiWord is GPL software. All data taken
or derived from Wikipedia is GFDL.
Enjoy,
Daniel
This looks very interesting!
Is this a thesaurus that can be used for translation of words across
languages?
Is there some way to quickly have a demo or view the data?
I browsed some files, and I see entries of the kind:
:xf5bfa ww:displayLabel "de:Feliner_Diabetes_mellitus" .
:xf5bfa ww:type wwct:OTHER .
:xf5bfa rdf:type skos:Concept .
:xf5bfa skos:inScheme <http://brightbyte.de/vocab/wikiword/dataset/*/animals:thesaurus> .
which tells me that Diabetes Mellitus of a feline is a concept... I was
interested in the animal thesaurus as a way to translate animal names across
languages... there are a lot of files, and I don't know if I am looking at
the right ones. Perhaps if you pointed us to the most interesting /
understandable datasets, it would be very useful.
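For what it's worth, the "lang:Label" convention in the displayLabel lines quoted above suggests a quick way to group labels by concept across languages. Here is a rough Python sketch; the second sample line and the exact triple layout are my own assumptions, not the official WikiWord schema:

```python
import re

# Match lines of the form:  :id ww:displayLabel "lang:Label" .
# This pattern is inferred from the quoted dump snippet, not from a spec.
LABEL_RE = re.compile(r'^(\S+)\s+ww:displayLabel\s+"(\w+):([^"]*)"')

def labels_by_concept(lines):
    """Map concept id -> {language: label} from ww:displayLabel triples."""
    concepts = {}
    for line in lines:
        m = LABEL_RE.match(line)
        if m:
            concept, lang, label = m.groups()
            concepts.setdefault(concept, {})[lang] = label
    return concepts

sample = [
    ':xf5bfa ww:displayLabel "de:Feliner_Diabetes_mellitus" .',
    ':xf5bfa ww:displayLabel "en:Feline_diabetes" .',  # hypothetical line
]
print(labels_by_concept(sample))
```

Grouping by the shared subject id would give you, per concept, the label in each language the dataset covers, which is essentially the cross-language translation table asked about above.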
I am sorry if the above remarks seem superficial; I cannot read German well
enough to read dissertations in it...
Best, Luca.
On Fri, May 30, 2008 at 2:54 AM, Daniel Kinzler <daniel(a)brightbyte.de>
wrote:
> [...]
Dear All,
we have three new techreps available:
- Robust Content-Driven
Reputation <http://www.soe.ucsc.edu/%7Eluca/papers/08/ucsc-soe-08-09.html> shows
that the content-driven reputation we proposed in a WWW 2007 paper can
be made robust to Sybil ("sock-puppet") and other coordinated attacks. In
WWW 2007, we proposed "content-driven reputation" for Wikipedia authors,
where authors gain reputation if their contributions are preserved, and lose
reputation if their contributions are quickly undone. The original
algorithms were very prone to attacks; we show here that they can be made
resistant.
- Assigning Trust to Wikipedia
Content <http://www.soe.ucsc.edu/%7Eluca/papers/08/ucsc-soe-08-07.html> proposes
computing the trust of Wikipedia text on the basis of the
reputation of the author, and the reputation of the people who revised the
text. We display text trust by coloring text background. Many of you have
seen the on-line demo for the English Wikipedia, at
http://trust.cse.ucsc.edu/ . This is an improved version of a November
2007 techrep on the same topic. In this improved techrep, we show how the
trust system can be made resistant to attacks.
- Measuring Author Contributions to the
Wikipedia <http://www.soe.ucsc.edu/%7Eluca/papers/08/ucsc-soe-08-08.html> defines
and compares various ways of measuring the contribution of
individual authors to the Wikipedia. We have our own favorite; read more to
find out :-)
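As a toy illustration of the content-driven idea in the first techrep above (authors gain reputation when their text survives later revisions and lose it when it is quickly undone), here is a sketch in Python. The weights and the word-survival measure are invented for illustration and are not the WWW 2007 algorithm:

```python
from collections import defaultdict

def update_reputation(rep, author, inserted_words, later_revision_text,
                      gain=1.0, loss=2.0):
    """Toy content-driven reputation update: reward the author for each
    inserted word still present in a later revision, penalize removed ones.
    The asymmetric weights (losses hurt more) are illustrative only."""
    later = set(later_revision_text.split())
    survived = sum(1 for w in inserted_words if w in later)
    undone = len(inserted_words) - survived
    rep[author] += gain * survived - loss * undone
    return rep

rep = defaultdict(float)
# alice's words survive into the later revision; bob's were reverted away.
update_reputation(rep, "alice", ["feline", "diabetes"],
                  "feline diabetes mellitus")
update_reputation(rep, "bob", ["spam", "link"],
                  "feline diabetes mellitus")
print(dict(rep))  # alice gains, bob loses
```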
In recent months, we have been busy working on
WikiTrust <http://trust.cse.ucsc.edu/>,
an open-source tool for assigning reputation to wiki authors and trust to
wiki content. We already have a batch (or "off-line") system, which can
compute reputation and trust based on wiki dumps, such as the Wikipedia
dumps made available by the Wikimedia Foundation. We are developing an
"on-line" system, which can assign reputation and trust in real-time, as
edits are made. One of our chief concerns in developing an on-line system
was to ensure that it was robust to attack, and we believe we have made
progress in this direction, as reported in the above techreps. We are now
proceeding with the implementation; my guess is that we will have a
prototype in a month or so.
By the way, the "batch" part of WikiTrust <http://trust.cse.ucsc.edu/> can
be easily adapted to carry out various analysis tasks. Basically, it walks
over all revisions of every page of a wiki, and it contains an efficient
text analysis engine that tells you precisely how text was changed between
versions. So, it is easy to use WikiTrust as a platform to write analysis
algorithms for wikis: you don't have to worry about the boring tasks of
reading and parsing markup language, and computing text diffs in a
reasonable way; you can concentrate on the details of the specific analysis
you want to do. It is all open source, and we welcome developers or people
interested in it.
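As a rough stand-in for the kind of change information such a diff engine produces, here is a word-level diff sketch using Python's standard difflib; this is not WikiTrust's own algorithm, just an illustration of the "which words were inserted or deleted between revisions" output an analysis would build on:

```python
import difflib

def word_diff(old, new):
    """Classify words as inserted or deleted between two revisions.
    Uses difflib.SequenceMatcher over word lists as a simple stand-in
    for a real revision-diff engine."""
    a, b = old.split(), new.split()
    sm = difflib.SequenceMatcher(a=a, b=b)
    inserted, deleted = [], []
    for op, a1, a2, b1, b2 in sm.get_opcodes():
        if op in ("replace", "delete"):
            deleted.extend(a[a1:a2])
        if op in ("replace", "insert"):
            inserted.extend(b[b1:b2])
    return inserted, deleted

ins, dele = word_diff("the quick brown fox", "the slow brown fox jumps")
print(ins, dele)  # ['slow', 'jumps'] ['quick']
```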
All the best,
Luca (with Ian, Bo, and the other wikitrusters).
Probably I should not be list owner... I will tend to be slow and might
accidentally miss something like this...
-------- Original Message --------
Subject: FW: [Call for participation] Wikipedia and AI: An Evolving Synergy
Date: Wed, 21 May 2008 18:39:04 -0700
From: Evgeniy Gabrilovich <gabr(a)yahoo-inc.com>
To: <wiki-research-l-owner(a)lists.wikimedia.org>
Hi,
Would you be so kind as to check why I'm not allowed to post a message to
wiki-research-l?
I'm trying to post a CFP for our workshop, and in the past I was able to
post similar
messages, but this time my request was declined for some reason.
Thank you in advance,
Evgeniy.
--
Evgeniy Gabrilovich, Ph.D.
Senior Research Scientist
Yahoo! Research
2821 Mission College Blvd, Santa Clara, CA 95054
Email: gabr(a)yahoo-inc.com
Phone: (office) 408-349-8155 (cell) 408-218-7284
> -----Original Message-----
> From: wiki-research-l-bounces(a)lists.wikimedia.org
> [mailto:wiki-research-l-bounces@lists.wikimedia.org] On
> Behalf Of wiki-research-l-owner(a)lists.wikimedia.org
> Sent: Wednesday, May 21, 2008 18:36
> To: Evgeniy Gabrilovich
> Subject: [Call for participation] Wikipedia and AI: An
> Evolving Synergy
>
> You are not allowed to post to this mailing list, and your message has
> been automatically rejected. If you think that your messages are
> being rejected in error, contact the mailing list owner at
> wiki-research-l-owner(a)lists.wikimedia.org.
>
>
Apologies for cross-postings.
== Call for Contributions ==
2nd Workshop on Scientific Communities of Practice (SCooP) on June 27th
2008 at Jacobs University Bremen.
http://jem-thematic.net/seminar/scoop2008
== Overview ==
Communities of Practice (CoPs) bring together people from all around the
globe around a common concern or set of problems, which they tackle by
exchanging knowledge, ideas, and expertise.
CoPs also exist in science, although scientific communities of practice
are more heterogeneous than their corporate counterparts, as members
come from a variety of backgrounds and disciplines. Yet, it is exactly
this interdisciplinarity that makes these groupings valuable for
deepening knowledge and learning.
In this context, SCooP aims at joining people from different fields,
such as mathematics, computer science, chemistry, physics, biology etc.,
who share a common interest -- Communities of Practice. The workshop
thus wants to facilitate the exchange of experiences and
implementations, and will in particular address questions such as:
* What are scientific or educational practices in educational and
scientific communities?
* Can these practices be automatically detected, collected, or modeled?
* What are implementations for CoPs?
* Which features make these tools so attractive and how do they support
(practices of) CoPs?
The workshop welcomes contributions in the following formats.
* Paper contributions (including position papers and research
proposals):
Max. 300 word abstract; paper submission
* Demonstrations and presentations of systems, prototypes, and mock-ups:
200-300 word abstract (presentation during the workshop, paper is
optional)
== Important dates ==
* _NEW_ submission deadline for abstracts: May 23rd (via email to
c.mueller(a)jacobs-university.de )
* Submission of papers: May 30th
* Notification of acceptance: June 6th
* Camera ready copies due: June 20th (approximately)
* Workshop in Bremen: June 27th
== Registration and Accommodation ==
* Registration via http://jem-thematic.net/seminar/scoop2008
* Accommodation at http://jem-thematic.net/seminar/scoop2008
== Further Links ==
* SCooP Mailing List:
http://lists.jacobs-university.de/mailman/listinfo/project-scoop
* SCooP Interest Group at http://jem-thematic.net/sig/scoop
The workshop is funded by the Joining Educational Mathematics Network
http://jem-thematic.net/
Erik & I had a good meeting last week with the MacArthur Foundation, who
reiterated their interest in funding research related to Wikipedia and
the other Wikimedia projects. Their primary interest is in developing a
better understanding of the Wikipedia audience (readers), but I believe
they are potentially interested in research into the contributor
community as well.
Our research goals & interests are laid out here
http://meta.wikimedia.org/wiki/Wikimedia_Foundation_Research_Goals
It's a pretty full list, but not an exhaustive one. We'd encourage
anyone who wants to conduct research into the Wikimedia projects to
approach MacArthur for funding, and/or talk to us.
Thanks,
Sue
-------- Original Message --------
Subject: [Wiki-research-l] Fwd: RfC: Wikimedia Foundation Research Goals
Date: Mon, 7 Apr 2008 16:03:41 -0700
From: Erik Moeller <erik(a)wikimedia.org>
Reply-To: Research into Wikimedia content and communities
<wiki-research-l(a)lists.wikimedia.org>
To: Research into Wikimedia content and communities
<wiki-research-l(a)lists.wikimedia.org>
References:
<b80736c80804071602t14d55745v4c34be87e347cb25(a)mail.gmail.com>
FYI
---------- Forwarded message ----------
From: Erik Moeller <erik(a)wikimedia.org>
Date: Apr 7, 2008 4:02 PM
Subject: RfC: Wikimedia Foundation Research Goals
To: Wikimedia Foundation Mailing List <foundation-l(a)lists.wikimedia.org>
Sue & I have drafted a set of research goals that the Wikimedia
Foundation supports. The purpose of the document is to have something
we can point researchers, universities, foundations, and other third
parties to when they ask us: So, what kind of research are you
interested in? Will you support/endorse my research proposal X? In
most cases, we will not actively pursue these goals directly -- we'll
just try to facilitate & endorse research by third parties.
These research goals need to line up with our overall organizational
goals to make sense, so we've tried to map research goals to
organizational goals.
In light of this constraint, please do feel free to make revisions, or
to suggest changes on the discussion page:
http://meta.wikimedia.org/wiki/Wikimedia_Foundation_Research_Goals
It's still a draft looked at by only two people - so we do expect it
to be incomplete. :-)
(BTW - I'm aware that some chapters are pursuing a research agenda on
their own: This is great, and these Foundation goals are in no way
meant to be prescriptive for chapters.)
--
Erik Möller
Deputy Director, Wikimedia Foundation
Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
--
Sue Gardner
Executive Director
Wikimedia Foundation
Your donations keep Wikipedia running! Support the Wikimedia Foundation
today: http://wikimediafoundation.org/wiki/Donate
Hello All!
Inspired by other cities, Copenhagen will also have its own version of
WikiWednesday. Wiki Wednesday is a monthly event that brings together wikiers,
wikipedians, wiki researchers, developers, bloggers, and anyone interested
in wikis, social software, and Web 2.0.
When: 14th of May 2008, at 17:30
Where: Studenterhuset, Købmagergade 52, København K
Distribute the invitation widely! Come and cheer '1st Life'!
Best regards, for the KWW organizing committee,
Rut Jesus (PhD student on wikis, center for phil of nature and science
studies, KU)