Dear Gabriel,
I have cross-posted this to the dbpedia-developers list.
@DBpedia Team: although the text below might seem out of context, the
Wikitext-list is actually the one that overlaps the most with our main
topic: parsing Wiki syntax and templates. Please have a look at the
http://www.mediawiki.org/wiki/Parsoid project.
@Gabriel:
I think we should not include the markup in the <noinclude> section, but
on the doc page of the template, so that it also helps regular editors
better understand what the templates mean.
Back then we actually designed our approach to work in this way and also
attempted to add it to WP.
Of course, we were using a naive WP:BOLD approach, which got deleted:
http://en.wikipedia.org/w/index.php?title=Template:Infobox_person/doc&o…
Nevertheless, we used template syntax, hoping that one day it would be
included in Wikipedia. See, for example:
http://mappings.dbpedia.org/index.php/Mapping:Infobox_actor
{{TemplateMapping
| mapToClass = Actor
| mappings =
{{ PropertyMapping | templateProperty = name | ontologyProperty = foaf:name }}
{{ PropertyMapping | templateProperty = birth_place | ontologyProperty = birthPlace }}
{{ DateIntervalMapping | templateProperty = yearsactive | startDateOntologyProperty = activeYearsStartYear | endDateOntologyProperty = activeYearsEndYear }}
....
}}
The mapping helps to interpret the template and parse out the values
correctly. It seems that you are trying to do something similar. Maybe we
can modify or change our approach so that it also fits your requirements.
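
To illustrate how such a mapping can guide a parser, here is a minimal
Python sketch. It assumes the mapping above has already been parsed into a
plain dictionary. The function name apply_mapping, the dictionary layout,
and the sample infobox values are hypothetical and are not the API of the
DBpedia extraction framework.

```python
import re

# Hypothetical in-memory form of the Infobox_actor mapping shown above.
# The layout of this dictionary is an assumption made for illustration.
ACTOR_MAPPING = {
    "mapToClass": "Actor",
    "mappings": [
        {"type": "PropertyMapping",
         "templateProperty": "name",
         "ontologyProperty": "foaf:name"},
        {"type": "PropertyMapping",
         "templateProperty": "birth_place",
         "ontologyProperty": "birthPlace"},
        {"type": "DateIntervalMapping",
         "templateProperty": "yearsactive",
         "startDateOntologyProperty": "activeYearsStartYear",
         "endDateOntologyProperty": "activeYearsEndYear"},
    ],
}


def apply_mapping(mapping, infobox):
    """Turn raw infobox parameters into (ontology property, value) pairs."""
    results = []
    for rule in mapping["mappings"]:
        raw = infobox.get(rule["templateProperty"])
        if raw is None:
            continue  # the article did not fill in this parameter
        if rule["type"] == "PropertyMapping":
            # Plain one-to-one mapping: copy the trimmed value.
            results.append((rule["ontologyProperty"], raw.strip()))
        elif rule["type"] == "DateIntervalMapping":
            # Naive interval parsing: take the first two four-digit years.
            years = re.findall(r"\d{4}", raw)
            if years:
                results.append((rule["startDateOntologyProperty"], int(years[0])))
            if len(years) > 1:
                results.append((rule["endDateOntologyProperty"], int(years[1])))
    return results


# Made-up infobox parameters, as a wikitext parser might hand them over.
sample_infobox = {
    "name": " Jane Example ",
    "birth_place": "Leipzig",
    "yearsactive": "1990-2005",
}

for prop, value in apply_mapping(ACTOR_MAPPING, sample_infobox):
    print(prop, "=", value)
```

The point of the sketch is only that the mapping tells the parser which
parameters matter and how to type their values, for example splitting
yearsactive into a start year and an end year.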
I will be on holiday for the rest of March, so there will be no further
mails from me during that time.
Sebastian
On 03/05/2012 06:43 PM, Gabriel Wicke wrote:
Awesome!! I forwarded it to the DBpedia developers. I think the Parsoid
project might interest some of our people. How is it possible to join?
Or is it Wikimedia-internal development? Is there a Parsoid mailing
list?
You are very welcome to join. The page
http://www.mediawiki.org/wiki/Parsoid
has most of the information to get you started. We are using this
mailing list for discussions. You can also catch me in the #mediawiki
IRC channel as gwicke.
Can JS handle this? I read somewhere that it was several orders of
magnitude slower than other languages... Maybe this is not true for Node.js.
Competition between JS runtimes has improved performance a lot in
recent years. See for example the fun Computer Language Benchmarks Game:
http://shootout.alioth.debian.org/u32/which-programming-languages-are-faste…
It is still hard to beat C or C++ performance for memory-dominated
tasks of course.
All the data in our mappings wiki was created to "mark up" Wikipedia
template parameters, so please try to reuse it. I think there are almost
200 active users on http://mappings.dbpedia.org/ who have added extra
parsing information to thousands of templates in Wikipedia across 20
languages. You can download and reuse it, or we can add your
requirements to it.
Our primary requirement is marking up all top-level template arguments
(and generated content like image thumbnails) to enable editing in the
visual editor. The editor could however also benefit from type
information, so refining vocabulary information (and perhaps mapping
into an ontology) is also interesting to us. We should definitely
collaborate on this.
What do you think about embedding schema information (maybe RDFa
profiles?) into the noinclude section of a template page?
Gabriel
_______________________________________________
Wikitext-l mailing list
Wikitext-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitext-l
--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Projects: http://nlp2rdf.org , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org