David Campeau wrote:
How does one find out which namespace an article is in
under the new dump scheme?
The XML looks something like this:
[snip]
Are we supposed to load the *ns0 file to know which
articles are in ns0? What
about the other namespaces?
The list of namespaces is right there in the dump:
<mediawiki
xmlns="http://www.mediawiki.org/xml/export-0.3/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.3/
http://www.mediawiki.org/xml/export-0.3.xsd" version="0.3"
xml:lang="aa">
<siteinfo>
<sitename>Wikipedia</sitename>
<base>http://aa.wikipedia.org/wiki/Main_Page</base>
<generator>MediaWiki 1.5beta3</generator>
<case>first-letter</case>
<namespaces>
<namespace key="-2">Media</namespace>
<namespace key="-1">Special</namespace>
<namespace key="0" />
<namespace key="1">Talk</namespace>
<namespace key="2">User</namespace>
<namespace key="3">User talk</namespace>
<namespace key="4">Wikipedia</namespace>
<namespace key="5">Wikipedia talk</namespace>
<namespace key="6">Image</namespace>
<namespace key="7">Image talk</namespace>
<namespace key="8">MediaWiki</namespace>
<namespace key="9">MediaWiki talk</namespace>
<namespace key="10">Template</namespace>
<namespace key="11">Template talk</namespace>
<namespace key="12">Help</namespace>
<namespace key="13">Help talk</namespace>
<namespace key="14">Category</namespace>
<namespace key="15">Category talk</namespace>
</namespaces>
</siteinfo>
<page>
<title>MediaWiki:1movedto2</title>
<id>1</id>
...
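
To make that concrete: each page title carries its namespace prefix (the page above, "MediaWiki:1movedto2", is in namespace 8), and a title with no recognized prefix is in namespace 0. A rough Python sketch of the lookup follows. The function names and the trimmed SAMPLE header are made up for illustration, and it assumes an exact, case-sensitive match on the prefix, which is simpler than what MediaWiki itself does with case and underscores.

```python
import xml.etree.ElementTree as ET

# XML namespace URI taken from the xmlns attribute of the dump header.
EXPORT_NS = "{http://www.mediawiki.org/xml/export-0.3/}"

# Trimmed-down header for illustration only; a real dump lists every namespace.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/" version="0.3">
  <siteinfo>
    <namespaces>
      <namespace key="-1">Special</namespace>
      <namespace key="0" />
      <namespace key="1">Talk</namespace>
      <namespace key="3">User talk</namespace>
      <namespace key="8">MediaWiki</namespace>
    </namespaces>
  </siteinfo>
</mediawiki>"""


def read_namespaces(xml_text):
    """Return a {prefix: key} dict built from the <namespaces> list."""
    root = ET.fromstring(xml_text)
    prefixes = {}
    for ns in root.iter(EXPORT_NS + "namespace"):
        name = ns.text or ""  # namespace 0 has an empty name
        if name:
            prefixes[name] = int(ns.get("key"))
    return prefixes


def namespace_of(title, prefixes):
    """Guess a page's namespace key from its title prefix (hypothetical helper)."""
    prefix, sep, _rest = title.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix]
    return 0  # no known prefix: main article namespace


prefixes = read_namespaces(SAMPLE)
print(namespace_of("MediaWiki:1movedto2", prefixes))
print(namespace_of("Main Page", prefixes))
```

A title that contains a colon but no known prefix, such as "Star Wars: Episode I", falls through to namespace 0. For a full dump you would probably read the header with a streaming parser rather than loading the whole file into memory.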
-- brion vibber (brion @ pobox.com)