Jochen Magnus wrote:
Brion,
I think the en dumps are not the only ones that failed. I found some
errors in the most recent de dump (20051017_pages_articles.xml):
inspecting the XML file manually, I saw several <title> elements that
do not match their content. For example:
<title>Vitruvius</title> contains an article about the planet Venus
<title>Indianische Flöte</title> contains the history of Poland
<title>Marlon Brando</title> contains Madonna (sic!)
and so on
Hmm, that shouldn't happen. I'll have to debug it. Sigh.
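A rough way to spot-check a dump for this kind of mismatch, as a sketch only: stream the XML and flag pages whose text never mentions their own title near the top. The function name suspicious_pages and the head_chars cutoff are made up for illustration, and the heuristic is crude (redirects and short stubs will trip it too), so treat the output as a list of candidates to inspect by hand, not as proof of corruption.

```python
import xml.etree.ElementTree as ET


def local(tag):
    # ElementTree reports namespaced tags as "{uri}name"; keep only "name".
    return tag.rsplit("}", 1)[-1]


def suspicious_pages(source, head_chars=2000):
    """Yield titles of pages whose text does not mention the title.

    `source` is a filename or a binary file object holding the dump.
    Only the first `head_chars` characters of the text are searched.
    Hypothetical helper: a heuristic, not a real consistency check.
    """
    title, text = None, ""
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local(elem.tag)
        if name == "title":
            title = elem.text or ""
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            if title and title.lower() not in text[:head_chars].lower():
                yield title
            title, text = None, ""
            elem.clear()  # drop the finished page's subtree
```

Calling it as `for t in suspicious_pages("20051017_pages_articles.xml"): print(t)` would list the candidates one per line.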
Besides, there are many articles of exceptional
length in the dump,
which do not belong in Namespace #0.
?
-- brion vibber (brion @ pobox.com)