On 2012-05-31 12:42, Gerd Zechmeister wrote:
> I'd like to extract German noun forms (Kasus and
> Numerus) but didn't find this data in the provided dumps.
> Example:
> http://de.wiktionary.org/wiki/Haus
> I need the data from the box:
> Kasus Singular Plural
> Nominativ das Haus die Häuser
This data is provided in the wiki template call:
{{Deutsch Substantiv Übersicht
|...
|Nominativ Singular=das Haus
|Nominativ Plural=die Häuser
...
You can find that template call in this XML dump (only 50 MB compressed):
http://dumps.wikimedia.org/dewiktionary/20120526/dewiktionary-20120526-page…
An old Perl script for parsing the XML dumps can be found here:
http://meta.wikimedia.org/wiki/User:LA2/Extraktor
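
If you would rather not use Perl, here is a minimal sketch in Python of the same idea. It is not the Extraktor script, and it rests on a few assumptions: that the dump is bz2-compressed, that each template parameter sits on its own line as in the example above, and that the parameter values contain no nested "}}". The names iter_pages and extract_noun_forms, and the file path you pass in, are made up for illustration.

```python
import bz2
import re
import xml.etree.ElementTree as ET

TEMPLATE_NAME = "Deutsch Substantiv Übersicht"

# Matches a parameter line such as "|Nominativ Singular=das Haus".
PARAM_RE = re.compile(r"^\|\s*([^=|]+?)\s*=\s*(.*?)\s*$")


def extract_noun_forms(wikitext):
    """Return {parameter: value} from the first noun overview template, or {}."""
    start = wikitext.find("{{" + TEMPLATE_NAME)
    if start == -1:
        return {}
    # Assumption: the first "}}" after the template name closes the template.
    end = wikitext.find("}}", start)
    if end == -1:
        end = len(wikitext)
    forms = {}
    for line in wikitext[start:end].splitlines():
        match = PARAM_RE.match(line.strip())
        if match:
            forms[match.group(1)] = match.group(2)
    return forms


def iter_pages(path):
    """Yield (title, wikitext) pairs from a bz2-compressed XML dump."""
    with bz2.open(path, "rb") as dump:
        title = None
        for _, elem in ET.iterparse(dump):
            tag = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace prefix
            if tag == "title":
                title = elem.text
            elif tag == "text":
                yield title, elem.text or ""
            elif tag == "page":
                elem.clear()  # release the finished page's contents
```

To use it, loop over iter_pages(path_to_your_dump) and call extract_noun_forms on each page text; pages without the template give an empty dict. For the Haus example above, the result should contain the keys "Nominativ Singular" and "Nominativ Plural".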
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se