Re: [Wikitech-l] A metadata API module for commons

4 Sep 2013

On 09/04/2013 09:59 AM, Brian Wolff wrote:
...
 This [1] looks quite acrobatic indeed. Can’t we make better use of the
 machine-readable markings provided by templates?
 <https://commons.wikimedia.org/wiki/Commons:Machine-readable_data>

 [1] https://gerrit.wikimedia.org/r/#/c/80403/4/CommonsMetadata_body.php

 It is using the machine-readable data from that page. (Although it's
 debatable how machine-readable "Look for a <td> with this id, and then
 look at the contents of the next sibling <td> you encounter" really is.)

 I'm somewhat of a newb though with extracting microformat-style
 metadata, so it's quite possible there is a better way, or some
 higher-level parsing library I could use (something like XPath maybe,
 although it's not really XML I'm looking at).
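
For readers unfamiliar with the lookup described above, here is a minimal
sketch of "find the <td> with this id, then read the next <td>". It is not
the code under review in [1]. It uses only Python's standard html.parser,
and the sample markup, the extract_field helper, and the ids
fileinfotpl_desc and fileinfotpl_aut are assumptions made for illustration.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Collect the text of the first <td> that follows a <td> with a given id."""

    def __init__(self, label_id):
        super().__init__()
        self.label_id = label_id
        # States: searching -> label -> waiting -> value -> done
        self.state = "searching"
        self.depth = 0       # nesting depth of <td> inside the value cell
        self.chunks = []     # text fragments found inside the value cell

    def handle_starttag(self, tag, attrs):
        if tag != "td":
            return
        if self.state == "searching" and dict(attrs).get("id") == self.label_id:
            self.state = "label"
        elif self.state == "waiting":
            self.state = "value"
            self.depth = 1
        elif self.state == "value":
            self.depth += 1  # a nested table cell inside the value cell

    def handle_endtag(self, tag):
        if tag != "td":
            return
        if self.state == "label":
            self.state = "waiting"
        elif self.state == "value":
            self.depth -= 1
            if self.depth == 0:
                self.state = "done"

    def handle_data(self, data):
        if self.state == "value":
            self.chunks.append(data)


def extract_field(html, label_id):
    """Return the whitespace-normalised text of the value cell, or None."""
    parser = FieldExtractor(label_id)
    parser.feed(html)
    parser.close()
    if parser.state not in ("value", "done"):
        return None
    return " ".join("".join(parser.chunks).split())


# Hypothetical markup, loosely shaped like an information template's output.
SAMPLE = """
<table>
<tr><td id="fileinfotpl_desc">Description</td>
<td>A <b>red</b> kite over the bay</td></tr>
<tr><td id="fileinfotpl_aut">Author</td>
<td>Example User</td></tr>
</table>
"""

print(extract_field(SAMPLE, "fileinfotpl_desc"))
print(extract_field(SAMPLE, "fileinfotpl_aut"))
```

The sketch assumes the label cell contains no nested table cells, and it
drops all markup inside the value cell, which is part of why this style of
extraction is fragile.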
Parsoid might be able to help you with access to template parameters
along with the fully expanded HTML that was produced from them. See [1].
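
As a rough illustration of the idea, the sketch below reads template
parameters from a data-mw attribute on an element marked
typeof="mw:Transclusion". The JSON shape (parts, template, target, params,
wt) is an assumption based on one reading of the DOM spec and may differ
between Parsoid versions. The sample markup and the
extract_template_params helper are hypothetical.

```python
import json
from html.parser import HTMLParser


class TransclusionParams(HTMLParser):
    """Collect (template name, parameters) pairs from data-mw attributes."""

    def __init__(self):
        super().__init__()
        self.templates = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # typeof may hold several space-separated values.
        if "mw:Transclusion" not in (attrs.get("typeof") or "").split():
            return
        data_mw = json.loads(attrs.get("data-mw") or "{}")
        for part in data_mw.get("parts", []):
            # Parts that are plain strings carry no template information.
            if not isinstance(part, dict) or "template" not in part:
                continue
            template = part["template"]
            params = {
                name: value.get("wt")
                for name, value in template.get("params", {}).items()
            }
            self.templates.append((template["target"]["wt"], params))


def extract_template_params(html):
    """Return a list of (template name, {parameter: wikitext}) tuples."""
    parser = TransclusionParams()
    parser.feed(html)
    parser.close()
    return parser.templates


# Hypothetical Parsoid-style output for a single template call.
SAMPLE = """
<table typeof="mw:Transclusion" about="#mwt1"
 data-mw='{"parts":[{"template":{"target":{"wt":"Information"},
 "params":{"description":{"wt":"A red kite over the bay"},
 "author":{"wt":"Example User"}},"i":0}}]}'>
<tr><td>Description</td><td>A red kite over the bay</td></tr>
</table>
"""

for name, params in extract_template_params(SAMPLE):
    print(name, params)
```

Compared with scanning for table cells, this reads the parameters as they
were written in the wikitext, so it does not depend on how the template
happens to lay out its HTML.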

We are going to work on page metadata storage as well, see [2] and [3].
Maybe our storage work could eventually also provide a backend for you.

Gabriel

[1]: https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec#Template_content
[2]: https://bugzilla.wikimedia.org/show_bug.cgi?id=53508
[3]: https://bugzilla.wikimedia.org/show_bug.cgi?id=49143
