Regarding the forms: they are not part of the dataset yet, and
unfortunately they are not easy to add. I think it would even require
some enhancement to the extractor (not just the config). But it's on my
todo list...
However, such "boxes" of word forms are probably easier to extract with
the default DBpedia infobox extractor. Maybe the DBpedia community could
help with that. The biggest problem there would be determining the
right "context" (i.e. the subject URI)...
I crossposted this to DBpedia, so they can reply.
Regards,
Jonas
On Friday, 01.06.2012, at 12:08 +0200, Lars Aronsson wrote:
On 2012-05-31 12:42, Gerd Zechmeister wrote:
I'd like to extract German noun forms (Kasus
and Numerus) but didn't find this data in the provided dumps.
Example:
http://de.wiktionary.org/wiki/Haus
I need the data from the box:
Kasus      Singular  Plural
Nominativ  das Haus  die Häuser
This is provided in the wiki template call
{{Deutsch Substantiv Übersicht
|...
|Nominativ Singular=das Haus
|Nominativ Plural=die Häuser
...
You can find that in this XML dump (only 50 MB compressed),
http://dumps.wikimedia.org/dewiktionary/20120526/dewiktionary-20120526-page…
An old Perl script for parsing the XML dumps is found here,
http://meta.wikimedia.org/wiki/User:LA2/Extraktor
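For anyone who would rather not start from the Perl script, here is a minimal Python sketch of the same idea: stream the pages out of a MediaWiki XML export and read the named parameters of the {{Deutsch Substantiv Übersicht}} call. The function names iter_pages and parse_template_params are made up for this example, and the parser is deliberately naive: it splits on "|", so values containing nested templates or piped links would need more care. It is not part of the DBpedia extractor.

```python
import xml.etree.ElementTree as ET


def iter_pages(xml_file):
    """Yield (title, wikitext) pairs from a MediaWiki XML export stream.

    xml_file is any binary file object, e.g. one returned by bz2.open().
    """
    title, text = None, None
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        # Drop the XML namespace, which differs between export versions.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "title":
            title = elem.text
        elif tag == "text":
            text = elem.text or ""
        elif tag == "page":
            yield title, text or ""
            title, text = None, None
            elem.clear()  # free memory held by the finished page


def parse_template_params(wikitext, template_name="Deutsch Substantiv Übersicht"):
    """Return the named parameters of the first call to template_name.

    Naive sketch: assumes parameter values contain no "|" characters.
    """
    start = wikitext.find("{{" + template_name)
    if start == -1:
        return {}
    # Walk forward to the matching "}}", tracking nesting depth.
    depth = 0
    i = start
    end = len(wikitext)
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                end = i - 2
                break
        else:
            i += 1
    body = wikitext[start + 2 + len(template_name):end]
    params = {}
    for part in body.split("|"):
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip()] = value.strip()
    return params


# Small in-memory sample using only the two parameters quoted above.
SAMPLE = """{{Deutsch Substantiv Übersicht
|Nominativ Singular=das Haus
|Nominativ Plural=die Häuser
}}
"""

if __name__ == "__main__":
    print(parse_template_params(SAMPLE))
```

To run this over the real dump, something like `with bz2.open(path_to_dump, "rb") as f: for title, text in iter_pages(f): ...` should work, where path_to_dump is wherever you saved the downloaded file.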