On Wed, 2 Mar 2005 20:14:25 +0000, Rowan Collins
<rowan.collins(a)gmail.com> wrote:
On Wed, 2 Mar 2005 13:35:06 -0600, Richard Holton
<richholton(a)gmail.com> wrote:
In looking through the MediaWiki schema (both current and new), I've
noticed that page titles are given a max length of 255 chars. However,
it seems that in some cases, this title includes the namespace, and in
others it does not -- namespace is stored as an Int or is implied by
context (eg, categorylinks, imagelinks).
Hm, I see what you mean - there aren't that many places where it's a
problem, but certainly 'brokenlinks' has the namespace as part of the
[destination] title. So it seems an article could have 255 characters
+ a namespace (because the namespace isn't considered part of the
title) and not fit in brokenlinks (because that just stores the text
of the link, rather than a namespace and title).
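To make the overflow concrete, here is a minimal sketch. It assumes the link text is stored as "Namespace:Title" in a column of the same 255 width, and it counts characters rather than bytes for simplicity; the function name is purely illustrative and is not taken from the MediaWiki code.

```python
MAX_TITLE_LEN = 255  # width of the title columns discussed above


def fits_in_link_column(namespace_name, title, width=MAX_TITLE_LEN):
    """Return True if the prefixed link text fits in a column of `width`.

    Hypothetical helper: an empty namespace_name stands for the main
    namespace, which has no prefix.
    """
    full = namespace_name + ":" + title if namespace_name else title
    return len(full) <= width


longest_title = "A" * MAX_TITLE_LEN
# A maximum-length title fits on its own, but not once "User:" is prepended.
print(fits_in_link_column("", longest_title))
print(fits_in_link_column("User", longest_title))
```

Under these assumptions, any title longer than 255 minus the prefix length stops fitting as soon as it lives outside the main namespace.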
There's been talk of merging the various links tables to all be
id->name (rather than some being id->id), because the text of the link
doesn't change, but the article it refers to might. This problem could
be addressed by anyone implementing that.
The disadvantage of using {namespace_as_int, title_as_text} for link
targets is that this doesn't reflect how they're entered: [[Foo:Bar]]
could change in meaning from {0, "Foo:Bar"} to {20, "Bar"} if a custom
"Foo" namespace was created; the two forms could not, however,
co-exist. This suggests to me that it would be better to just make the
link_to field wider than page_title (i.e. a width of 255 + a constant
MAX_NAMESPACE_LENGTH), and retain the current practice of storing the
destination as one string.
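The change in meaning can be sketched roughly as follows. This is only an illustration: resolve_link and the namespace tables are made up for the example, the number 20 for "Foo" is the one used above, and real title parsing involves more normalisation than a split on the first colon.

```python
def resolve_link(text, namespaces):
    """Split link text into (namespace_id, title).

    `namespaces` maps a namespace name to its integer id. Text with no
    known prefix is treated as a main-namespace (0) title, colon and all.
    """
    prefix, sep, rest = text.partition(":")
    if sep and prefix in namespaces:
        return namespaces[prefix], rest
    return 0, text


before = {"User": 2}             # no custom namespace yet
after = {"User": 2, "Foo": 20}   # custom "Foo" namespace created

print(resolve_link("Foo:Bar", before))
print(resolve_link("Foo:Bar", after))
```

The stored string "Foo:Bar" is identical in both cases; only its resolution changes. A stored {0, "Foo:Bar"} pair, by contrast, would have to be rewritten when the namespace is created.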
I notice that in the new schema, the 'page' table uses the
{namespace_as_int, title_as_text} form, and it doesn't save the
namespace within the title. (Was that true of the old schema as well?)
I don't want to second-guess the new schema. It does seem that the
link tables should use the same method of identifying pages as the
'page' table does.
For 'categorylinks', having the namespace in the index would allow
fast separation by namespace.
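As a rough sketch of what I mean, using SQLite from Python rather than the real database. The table and column names here are invented for the demo and are not the actual categorylinks columns; 0 and 14 are the main and Category namespace numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for a category membership table that stores the
# member's namespace as an integer alongside its title.
conn.execute(
    "CREATE TABLE catlinks_demo ("
    " category TEXT NOT NULL,"
    " member_namespace INTEGER NOT NULL,"
    " member_title TEXT NOT NULL)"
)
# With the namespace in the index, one namespace's members can be read
# as a contiguous index range.
conn.execute(
    "CREATE INDEX catlinks_demo_ns"
    " ON catlinks_demo (category, member_namespace, member_title)"
)
conn.executemany(
    "INSERT INTO catlinks_demo VALUES (?, ?, ?)",
    [
        ("Physics", 0, "Quark"),
        ("Physics", 0, "Atom"),
        ("Physics", 14, "Particles"),
    ],
)


def members(category, namespace):
    """Return the titles in `category` that belong to `namespace`, sorted."""
    cur = conn.execute(
        "SELECT member_title FROM catlinks_demo"
        " WHERE category = ? AND member_namespace = ?"
        " ORDER BY member_title",
        (category, namespace),
    )
    return [row[0] for row in cur]


print(members("Physics", 0))
print(members("Physics", 14))
```

Whether the real query planner would use such an index the same way is something I haven't checked.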
-Rich Holton
en.wikipedia:User:Rholton