The subject says it all, but for quick reference, here is the CfP again.
Cheers,
Martin
-------------------------------------------------------------------------------
WSDM Cup 2017: Call for Participation
-------------------------------------------------------------------------------
We invite you to take part in the following shared tasks:
Task 1.
Vandalism Detection -- Given a Wikidata revision, is it damaging?
This task is about detecting vandalism as well as all other kinds of damaging
edits to Wikidata. Doing so protects not only Wikidata's integrity, but also
that of all information systems making use of the knowledge base.
Task 2.
Triple Scoring -- Compute relevance scores for triples from type-like relations.
For example, the triple "Johnny_Depp profession Actor" should get a high score,
because acting is Depp's main profession, whereas "Quentin_Tarantino profession
Actor" should get a low score, because Tarantino is more of a director than an
actor. Such scores are a basic ingredient for ranking results in entity search.
Learn more at http://www.wsdm-cup-2017.org
Register now at https://goo.gl/forms/JaVQwFFewLtVFCik2
-------------------------------------------------------------------------------
Important Dates
-------------------------------------------------------------------------------
now open Registration
Sep 1, 2016 Training data release
Oct 15, 2016 Early bird software submission
Dec 8, 2016 Final software submission
Dec 22, 2016 Announcement of evaluation results
Jan 5, 2017 Paper submission
Feb 6-10, 2017 Conference and WSDM Cup workshop
All deadlines are 11:59 PM, anywhere on earth (AoE).
-------------------------------------------------------------------------------
Special Announcements
-------------------------------------------------------------------------------
Awards for best-performing submissions.
Adobe Systems, Inc. sponsors the WSDM Cup with a total of $5000 in awards for
the best-performing submissions. The winner of each task will receive an
award of $1500, and the second and third runners-up $750 and $250.
Evaluation as a Service.
For the sake of reproducibility, we ask you to submit your software instead of
just its run output. Software submissions allow for preserving your software
in working condition, and for re-evaluating it as new datasets appear.
To facilitate software submissions, we will make use of the cloud-based
evaluation platform TIRA (www.tira.io).
Open Source Proceedings.
We encourage the open source release of your software. To maximize the impact
of your software, we collect it in a central repository on GitHub:
https://github.com/wsdm-cup-2017
Private repositories can be assigned to you on request during the competition.
Benefits for early birds.
Submitting your software or your notebook early, as well as registering early
for the conference, will be rewarded. Check out the specific benefits on
our web page at http://www.wsdm-cup-2017.org
Hi,
SQID uses a somewhat challenging SPARQL query to refresh its statistical
data on the current usage of classes [1]. This is done once per hour,
with one retry after 60 seconds if the first attempt times out. In the past,
timeouts have been common, but the query usually worked after a while.
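The refresh strategy described above (run once, wait, retry exactly once) can be sketched as a small generic helper. This is a hypothetical illustration of the described behaviour, not SQID's actual code:

```java
import java.util.concurrent.Callable;

// Hypothetical helper illustrating the refresh strategy described above:
// run the query once; if it fails (e.g. a WDQS timeout), wait the given
// delay and try exactly one more time before giving up.
public class Retry {
    public static <T> T withOneRetry(Callable<T> query, long delayMillis)
            throws Exception {
        try {
            return query.call();
        } catch (Exception firstFailure) {
            Thread.sleep(delayMillis); // 60 000 ms in the scenario above
            return query.call();       // a second failure propagates
        }
    }
}
```

With both attempts timing out every hour, as reported below, such a wrapper fails 24 times a day while issuing 48 queries.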
For the past few days, however, the query has always timed out. Despite the
48 attempts throughout each day, the query has not succeeded once since
8/30/2016, 8:12:28 PM [2].
Possible explanations:
* WDQS experiences more load now (every day, every hour).
* The query got slower because the overall number of P31 statements
increased suddenly (or for some reason crossed some threshold).
* There have been technical changes to WDQS that reduce performance.
I don't have statistics on the success rate of the problematic query in
past weeks, so I cannot say if the timeout rate had increased before the
current week.
Does anybody have further information or observations that could help
clarify what is going on? We can rewrite our software to use simpler
queries if this one no longer works, but that seems like a step backwards.
Best regards,
Markus
[1] Here is the query:

SELECT ?cl ?clLabel ?c WHERE {
  { SELECT ?cl (count(*) AS ?c) WHERE { ?i wdt:P31 ?cl } GROUP BY ?cl }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
[2] https://tools.wmflabs.org/sqid/#/status
--
Prof. Dr. Markus Kroetzsch
Knowledge-Based Systems Group
Faculty of Computer Science
TU Dresden
+49 351 463 38486
https://iccl.inf.tu-dresden.de/web/KBS/en
Hello Scott,
Thanks for your input! If I understand this right, your concern is that
there might be lists like "list of species" which are impossible (with
several million entries) to have as a single list?
There is the splitting-lists scenario,
https://www.wikidata.org/wiki/Wikidata:List_generation_input/Scenario_C_spl…
Does this go in the right direction for you (even though it involves far
fewer items)?
Does
> cellular (neuronal) and nano (atomic?)
refer to something like the possibility to create lists of lists?
Kind Regards,
Jan
Message: 4
> Date: Thu, 1 Sep 2016 14:05:29 -0700
> From: Info WorldUniversity <info(a)worlduniversityandschool.org>
> To: "Discussion list for the Wikidata project."
> <wikidata(a)lists.wikimedia.org>
> Subject: Re: [Wikidata] List generation input
> Message-ID:
> <CAEPEA68HBgsYsEeWcV9EoCfFQ-rr2TihNG3LNMhw7TMoGbFBLw@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Léa, Jan, Gerard, Markus, Wikidatans and All,
>
> I've read at different times that there are anywhere from 3 to 100 million
> species (the latter would be a long list indeed!) and when you get probably
> to different lists at the cellular (neuronal) and nano (atomic?) levels,
> for example, the lists will probably get "way" longer :)
>
> Thank you, Scott
>
Hello all,
The Coding Da Vinci <https://codingdavinci.de/> GLAM hackathon is organized
by volunteers in Hamburg, in September 17-18.
They're looking for someone who could be able to present the Wikidata API
(and probably the Query Service) to the attendees, people working in
museums, libraries... and also IT.
The transport and venue would be funded by WMDE under conditions.
If you're interested or know someone who could be, please let me know as
soon as possible!
Thank you very much,
--
Léa Lacroix
Community Communication Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/029/42207.
Hi Scott,
Thanks for your suggestion!
I'd love to have more good information on using Wikidata. I assume, though,
that the Scenarios page for *List generation input* may not be the best
place for it (although the name suggests input in general, and I am guilty
of not choosing a more specific title!)
The scenarios currently concern sharing the (real-life) workflows of people
who generate lists, so it may run counter to that idea to suggest or teach
workflows there (although beginners might find them useful examples!)
If you want to create such a help page, feel free to link it on the List
generation input page (as said above, the title might suggest such
content).
Kind Regards,
Jan
Message: 1
> Date: Wed, 31 Aug 2016 11:44:33 -0700
> From: Scott MacLeod <worlduniversityandschool(a)gmail.com>
> To: "Discussion list for the Wikidata project."
> <wikidata(a)lists.wikimedia.org>
> Subject: Re: [Wikidata] List generation input
> Message-ID:
> <CADy6Cs_qZ6=sWKaCkNEZZ4LqatuK=sCb2TvaVBCOwc7a5=bnFw(a)mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Lea, Jan, Gerrard and All,
>
> In terms of "List Generation Input," would it make sense to add a basic
> intro tutorial (in video perhaps - is there one even?) about how to create
> a new large list in the first place in Wikipedia based on Wikidata /
> Wikibase, as well as a possible sandbox for exploring this? (When I google
> searched on "starting a long list in Wikipedia," I found this
> https://en.wikipedia.org/wiki/Help:List - "This help page explains how to
> create and edit lists on the English Wikipedia," which also doesn't mention
> Wikidata).
>
> Say, for example, I wanted to create a complete list of all languages in
> Wikipedia beginning with Glottolog's CC list here -
> http://glottolog.org/glottolog/language - with 7,943 entries in languages
> (to create wiki schools for open teaching and learning in those languages
> (using WUaS's Subject Template -
> http://worlduniversity.wikia.com/wiki/SUBJECT_TEMPLATE), how would I begin
> to do this? (WUaS donated CC WUaS to Wikidata last autumn). (Similarly
> with all species, beginning with a complete CC list, for example, and in
> order to eventually co-develop, for example, an all-species' image
> recognition application - hold your smartphone up to any species anywhere,
> and identify it - and in any language), how would I begin to add such lists
> to Wikipedia with Wikidata?
>
> Shall I add a scenario D to this page -
> https://www.wikidata.org/wiki/Wikidata:List_generation_input - with these
> questions above?
>
> Thank you, Scott
>
> worlduniversityandschool.org
Hello folks,
The Wikidata development team is currently working on tools to improve *list
creation on Wikipedia*, based on Wikidata data.
In order to understand what could be useful for you and why, we offer
*three examples of user scenarios
<https://www.wikidata.org/wiki/Wikidata:List_generation_input>*, in which
you may recognize some of your current uses: how you currently edit
lists on Wikipedia, which tools or processes you use, and what could
be improved.
You can answer some short questions and add comments on our assumptions on
each related talk page. This input is very important to help us understand
how you edit the lists on Wikipedia, and what tools could be useful for you.
Thanks to all of you who will take a few minutes to answer our questions!
Jan & Léa
--
Léa Lacroix
Community Communication Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Hi,
I've written some code to process the Wikidata dump, following the Wikidata
Toolkit examples.
In processItemDocument, I extract the target entityId of the 'instance of'
(P31) property for the current item. However, I'm unable to find a way to
get the label of the target entity, given that I have the entityId but not
the entityDocument. Help would be appreciated :)
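Since a dump scan streams each item past your processor exactly once, one common pattern is to index labels in a first pass and resolve the P31 target ids afterwards (or in a second scan of the dump). A minimal sketch of that bookkeeping — the class and method names here are illustrative, not part of the Toolkit's API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: resolve labels for instance-of (P31) targets during a dump scan.
// Pass 1: record each item's English label as it streams by, e.g. from
// processItemDocument, keyed by the item's entity id ("Q42", ...).
// Pass 2: look up the stored label for any entityId referenced by P31.
public class LabelIndex {
    private final Map<String, String> labelById = new HashMap<>();

    // Call in pass 1 for every item, e.g. with
    // itemDocument.getEntityId().getId() and the item's "en" label.
    public void recordLabel(String entityId, String enLabel) {
        labelById.put(entityId, enLabel);
    }

    // Call in pass 2 once the index is complete.
    public String labelOf(String entityId) {
        // Fall back to the raw Q-id if the target had no English label.
        return labelById.getOrDefault(entityId, entityId);
    }
}
```

Alternatively, if I recall correctly, the Toolkit's WikibaseDataFetcher can fetch a single entity's document over the web API, but doing that for every P31 target inside a full dump scan would be very slow compared to a two-pass index.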
-Sumit Asthana,
B.Tech Final Year,
Dept. of CSE,
IIT Patna