"Platonides" <Platonides(a)gmail.com> wrote in
message news:efk2vt$1gk$1@sea.gmane.org...
"Mark Clements" wrote:
As a separate, but related question, why is the namespace not given as part
of the page information?
e.g.
<title>Help:Contents</title>
<namespace>12</namespace>
<pagetitle>Contents</pagetitle>
Surely this would be more useful when it comes to wider application?
- Mark Clements (HappyDog)
I'd add it there as <title ns="12">Help:Contents</title> (undefined
parameter meaning 'old xml version', not main namespace)
Giving title, namespace and pagetitle is redundant and should be avoided.
The redundancy can amount to several MB in uncompressed dumps.
That's a pretty good solution, although one of the issues is that the title
includes the namespace, which needs to be removed to get the actual page
title. I feel that the <page> section should be complete in and of itself,
without requiring the header section mapping namespace names to ids. Without
knowing the mappings (ns to ns-title) that are present in the header, you
cannot interpret the title unambiguously. For example, <title ns="0">Star
Trek: The Next Generation</title> relies on the parser knowing that ns-0 is
not called 'Star Trek' in order to be interpreted properly.
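
To make the ambiguity concrete, here is a rough Python sketch. The
split_title helper and both namespace maps are hypothetical, made up for
illustration; the map stands in for whatever the dump header declares, and
the id 100 for a 'Star Trek' namespace is purely an assumption.

```python
def split_title(full_title, namespaces):
    """Split a prefixed title into (ns_id, page_title).

    `namespaces` is a name -> id map, standing in for the dump header.
    """
    prefix, sep, rest = full_title.partition(":")
    if sep and prefix in namespaces:
        # Known namespace prefix: strip it off the title.
        return namespaces[prefix], rest.lstrip()
    # No known prefix: the whole string is a main-namespace title.
    return 0, full_title


# Hypothetical header mappings for two different wikis.
ORDINARY_WIKI = {"Help": 12, "Talk": 1}
STAR_TREK_WIKI = {"Help": 12, "Star Trek": 100}  # assumed custom namespace

title = "Star Trek: The Next Generation"
print(split_title("Help:Contents", ORDINARY_WIKI))
print(split_title(title, ORDINARY_WIKI))
print(split_title(title, STAR_TREK_WIKI))
```

The same title string comes back as a main-namespace page under the first
map and as a page called 'The Next Generation' under the second, so the
<page> section cannot be read correctly without the header.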
How about <title ns="12" ns-title="Help">Contents</title>?
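
As a rough sketch of how a consumer could read that form without touching
the header (Python, standard library only): the read_title function is
hypothetical, and the ns and ns-title attributes are only the proposal
above, not anything the current dump format provides.

```python
import xml.etree.ElementTree as ET


def read_title(xml_fragment):
    """Parse a proposed <title ns=".." ns-title="..">..</title> element.

    Returns (ns_id, ns_title, page_title, full_title). ns_id is None when
    the ns attribute is absent, i.e. an old-style dump.
    """
    elem = ET.fromstring(xml_fragment)
    ns = elem.get("ns")
    ns_id = int(ns) if ns is not None else None
    ns_title = elem.get("ns-title", "")
    page_title = elem.text or ""
    # Rebuild the prefixed form only when a namespace name was given.
    full_title = f"{ns_title}:{page_title}" if ns_title else page_title
    return ns_id, ns_title, page_title, full_title


print(read_title('<title ns="12" ns-title="Help">Contents</title>'))
print(read_title('<title ns="0">Star Trek: The Next Generation</title>'))
```

Nothing here needs the ns to ns-title mapping, and the colon in the Star
Trek title is never mistaken for a namespace separator.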
- Mark Clements (HappyDog)