Hey folks :)
We've been doing a lot of groundwork over the past months in order to
support new entity types. We need this for Wikimedia Commons support among
other things. Today we have created the very first media info entity. This
is the equivalent of an item but for storing structured data about media
files. It's still ugly and unusable but it's a major step on the way to
supporting structured data on Commons and I wanted to share that with you.
Lots of work ahead still but we're making progress. Next step: public,
ugly, not very functional demo system.
\o/
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by
the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
Hi!
During the data reload to enable the geospatial service, we have
discovered a problem with the Wikidata dumps
(https://phabricator.wikimedia.org/T133924). The effect of this problem
is that some items are missing from the dump. Dumps starting with
20160418 are affected; previous ones seem to be fine.
The immediate fix for this would be to reload the data from a correct
dump (20160411) and re-sync the data since then. Unfortunately, this may
take some time (a day or so for reload, and another day or so for
resync), and until then you'll see some missing data on
query.wikidata.org. Please be patient until then.
I apologize for the inconvenience caused, and will continue to research
the cause of the missing data and then fix it. I'll update the ticket
when we have new info.
Thanks,
--
Stas Malyshev
smalyshev(a)wikimedia.org
Hey folks :)
A while ago we asked for testing of the arbitrary access feature on the
Commons test system:
https://lists.wikimedia.org/pipermail/wikidata/2016-March/008447.html There
were no major issues that I am aware of, so we're moving ahead and will
enable it on the 26th of April.
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
I'll start with the simple question, then give the longer context. Is there
any way to know how many times an item or a claim appears in the results of
a query to query.wikidata.org? Are there any other ways to quantify
query/application usage of specific Wikidata content?
Background: the Gene Wiki people recently attended a conference on
'biocuration' (the construction and maintenance of biological databases),
where we gave multiple Wikidata-related presentations. The community there
generally had a very positive reaction to what we have been doing, but many
were concerned about attribution. They wanted to know that when data is
imported into Wikidata from their resources (e.g. the Gene Ontology),
there is some way to ensure that the world knows where it came from, so
that the authors can get appropriate credit (which translates into grant
money, which translates into their jobs). We explained the reference model
to them, which helped, but they are still concerned.
The most important consequence of moving data into Wikidata is that it can
get used - sometimes a lot! (e.g. when displayed on Wikipedia articles).
If we could quantify usage for data providers, it would really help them
make the argument to their funding sources that contributing to Wikidata
increases their impact. If we can get that across, it would help bring
more people, more high-quality data, and more funding into the Wikidata
fold.
thoughts?
-Ben
Hi!
I've created a deployment test server for Wikidata Query Service:
https://wdqs.wmflabs.org/. This server is a Labs copy of
query.wikidata.org and is deployed hourly from the main deployment repo.
Previously, I used the test/development server for this, but due to the
obvious stability-requirement conflicts between a test server (which I
should be able to shut down, break, and mess with repeatedly) and a
deployment server (which should be in a stable state), I have split those
functions.
Please tell me if you notice anything wrong with it.
Thanks,
--
Stas Malyshev
smalyshev(a)wikimedia.org
A way to achieve this could be to fetch all labels and aliases for all
chemical compounds in one query and store them locally in your web
application. This is only feasible, of course, if the number of compounds
in Wikidata does not get too big. Currently, the query takes ~6 seconds.
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?cmpnd ?label WHERE {
  { ?cmpnd wdt:P279 wd:Q11173 . } UNION
  { ?cmpnd wdt:P31 wd:Q11173 . }
  # aliases are exposed as skos:altLabel in the Wikidata RDF mapping
  { ?cmpnd rdfs:label ?label . } UNION
  { ?cmpnd skos:altLabel ?label . }
}
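For the "store them locally" part, a minimal sketch (assuming Python with
only the standard library; the endpoint URL and the JSON shape follow the
standard SPARQL 1.1 JSON results format served by query.wikidata.org):

```python
# Sketch: run a query like the one above against query.wikidata.org and
# build a local name -> QID lookup table for autocompletion.
# run_sparql needs network access; build_lookup is a pure helper.
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

def run_sparql(query):
    """GET against the WDQS endpoint, returning the parsed JSON results."""
    url = ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"})
    req = urllib.request.Request(url, headers={"User-Agent": "label-cache-demo"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def build_lookup(results):
    """Turn SPARQL JSON bindings into a dict: lowercased label -> QID."""
    lookup = {}
    for row in results["results"]["bindings"]:
        # the ?cmpnd binding is a full entity URI; keep only the QID suffix
        qid = row["cmpnd"]["value"].rsplit("/", 1)[-1]
        lookup[row["label"]["value"].lower()] = qid
    return lookup
```

With the table cached locally, autocompletion becomes a prefix scan over
the dict keys instead of a live SPARQL round trip per keystroke.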
Best,
Sebastian
(sebotic)
> Hi all! I'm building a web application where users can search for
> protein/compound/etc. names and view their 3D structure using WebGL. I'm
> currently using the PubChem (chemical compounds database) API to provide
> some autocomplete data, but I found that Wikidata also has many chemical
> compound names with PubChem indices! The most important reason to try to
> autocomplete compound names via Wikidata is to allow users to search in
> different languages. PubChem generally only provides English names.
> However, I could not find a suitable API for this. I tried building a
> SPARQL query but that quickly became very slow. I could not find an option
> to limit full-text searches to a specific subclass in the search API
> provided by: https://www.wikidata.org/w/api.php?action=help&modules=wbsearchentities.
> Do you have any ideas? The only option I see for now is iterating each
> response entity and looking up their subclass of/instance of property.
>
Hello!
To enable geosearch [1] on WDQS, we need to do a full data load to
re-index all data with the Geosearch extension. We will use this
opportunity to also do a full reinstall of one of our WDQS servers to
increase available disk space.
During this data load / reinstall, we will be running on a single
server, so you can expect slower response times.
This operation will start Tuesday April 26 around 8am UTC. Data load
for both servers is expected to take multiple days.
You can follow the progress on the corresponding Phabricator task [2].
Thank you for your patience!
Guillaume
[1] https://phabricator.wikimedia.org/T123565
[2] https://phabricator.wikimedia.org/T133566
--
Guillaume Lederrey
Operations Engineer, Discovery
Wikimedia Foundation
Hi all! I'm building a web application where users can search for
protein/compound/etc. names and view their 3D structure using WebGL. I'm
currently using the PubChem (chemical compounds database) API to provide
some autocomplete data, but I found that Wikidata also has many chemical
compound names with PubChem indices! The most important reason to try to
autocomplete compound names via Wikidata is to allow users to search in
different languages. PubChem generally only provides English names.
However, I could not find a suitable API for this. I tried building a
SPARQL query but that quickly became very slow. I could not find an option
to limit full-text searches to a specific subclass in the search API
provided by:
https://www.wikidata.org/w/api.php?action=help&modules=wbsearchentities.
Do you have any ideas? The only option I see for now is iterating each
response entity and looking up their subclass of/instance of property.
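The workaround the post ends with - search first, then check each hit's
"instance of"/"subclass of" - can be sketched like this (a sketch under
assumptions: Python with only the standard library, the public action API,
and client-side filtering being fast enough for autocomplete; Q11173 is
"chemical compound"):

```python
# Sketch of the two-step approach: full-text search via wbsearchentities,
# then keep only hits whose P31 (instance of) / P279 (subclass of) claims
# point at the wanted class. The two API calls need network access;
# matching_ids is a pure helper over the wbgetentities response shape.
import json
import urllib.parse
import urllib.request

API = "https://www.wikidata.org/w/api.php"

def api_get(params):
    """GET against the Wikidata action API, returning parsed JSON."""
    url = API + "?" + urllib.parse.urlencode(dict(params, format="json"))
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def matching_ids(entities, target_class="Q11173"):
    """Ids of entities with a direct P31/P279 claim on target_class."""
    keep = set()
    for qid, ent in entities.items():
        for prop in ("P31", "P279"):
            for claim in ent.get("claims", {}).get(prop, []):
                value = claim["mainsnak"].get("datavalue", {}).get("value", {})
                if value.get("id") == target_class:
                    keep.add(qid)
    return keep

def search_compounds(term, lang="en"):
    """Autocomplete search restricted (client-side) to chemical compounds."""
    hits = api_get({"action": "wbsearchentities", "search": term,
                    "language": lang, "type": "item",
                    "limit": 20}).get("search", [])
    if not hits:
        return []
    entities = api_get({"action": "wbgetentities",
                        "ids": "|".join(h["id"] for h in hits),
                        "props": "claims"}).get("entities", {})
    keep = matching_ids(entities)
    return [h for h in hits if h["id"] in keep]
```

One caveat: this only checks direct P31/P279 claims, not the transitive
subclass closure, so items typed with a subclass of Q11173 would be
missed; catching those needs either another round of lookups or the
SPARQL route.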