I'm not a fan of "Display". While that form is used in the normal
interface, it's not necessarily the actual title that will be displayed
inside the title header. Remember that there is a third form if the wiki
supports the upgraded DISPLAYTITLE, which will be enabled by default ^_^
in a semi-neutered state (lol) where it functions like the standard
DISPLAYTITLE for backwards compatibility. On top of that, many wikis
are likely to enable an option or an extension which allows less strict
display titles, as well as some purely extension-driven ones.
So, I think Display should be reserved for what goes inside of the
header, rather than the un-normalized name of the title.
So potential words:
* Normalized form for inside the database: Key, DB Key, Unique,
Normalized
* Non-normalized form for interface use: Real, Text, Interface, Display,
UI, UnNormalized
* Format generated by extensions or other things for use in the title
bar: Display, Generated
No, @ should never be banned from the namespace. It's OK to make it
invalid for creation and use; however, it's not OK to kill any User
namespace title using it.
Remember that some things, like the idea of Transwiki imports, append
the wiki a user came from after an @. Those still link to names inside
the user namespace. It would be best to keep @ valid so that those pages
can be created with information on that user from the other wiki.
Hmmm... well, there are three things to actually consider with the
format of the name.
Firstly, most of the code inside MediaWiki that deals with titles,
including much of what deals with broken links, is likely to create
Title objects using factory functions which make use of the key format,
NOT the real format. Most of it does that for good reasons which should
not be altered. So, a high percentage of the time you have a Title
object generated by MediaWiki itself, which has no concept of, or input
for, what the real title is.
Secondly, many broken links are created in special pages, such as
Wantedpages, Broken redirects, etc. None of these will ever have a real
format inside the database, and there is no title attribute to attempt
to use.
Thirdly, there is editing (honestly I haven't been thinking much about
that; I've been more focused on [[Special:Movepage]]). There are a few
issues with directly using the title taken from the URL.
* External sites may link using the format which search engines have
typically used, where a + stands for a space, and that format may be in
use in other systems as well. That way we may end up with titles such as
[[Main+Page]], with a literal + inside the title, which we don't
actually want.
* While we can alter the way that links are generated for the current
site, we cannot guarantee anything about another wiki. We do have the
case stuff for interwiki which allows for keeping the case of the first
letter, but there's no guarantee that another wiki won't use a link like
[[Wikipedia:Some Page]], which will link to
http://en.wikipedia.org/wiki/Some_Page. That view page would then
have the _ inside of it, and a user who hits edit would end up
creating a page with an _ inside the title.
* Don't forget us users who don't use the search box, or preview link
methods, but instead use the address bar altering method. It's highly
likely there are editors who will edit the address bar and use a _ to
get where they want to go for page editing. And it's not good for this
to be reflected inside of the title. And there's no guarantee on what
the browser will do when we type actual characters inside the address
bar, and we shouldn't need to memorize percent encodings for characters.
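A small illustration of the URL issues in the list above, sketched in Python rather than MediaWiki's PHP. The helper name title_from_path is made up for this example; only the standard-library decoding functions are real, and this is not how MediaWiki itself parses URLs.

```python
from urllib.parse import unquote, unquote_plus


def title_from_path(path_segment: str) -> str:
    """Percent-decode a /wiki/<title> path segment and do nothing else.

    Hypothetical helper: it stands in for "directly using the title
    made by the URL" without any normalization.
    """
    return unquote(path_segment)


# Path decoding leaves '+' alone, so the title gets a literal plus sign.
print(title_from_path("Main+Page"))   # Main+Page
# Form-style decoding, which search-style URLs assume, gives a space.
print(unquote_plus("Main+Page"))      # Main Page
# An underscore from another wiki or the address bar survives untouched.
print(title_from_path("Some_Page"))   # Some_Page
# Percent encodings are what a user would otherwise need to memorize.
print(title_from_path("Caf%C3%A9"))   # Café
```

The point of the sketch is only that a raw URL cannot tell us whether a + or _ was meant literally, which is why the URL alone is a poor source for the real title.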
Actually, I'm thinking the best method for that edit issue may be the
addition of a new input to the edit page. Basically, a Title input
would appear above the textbox (where the section title is currently
located) when editing a new page. By default this would contain the
back-converted title (to avoid any possible issue, including a user
leaving the first letter lowercase when it should be capitalized), and
the user could edit it to reformat the title. We could make it so that
the title there is required to normalize back to the normal title. But
why not kill two birds with one stone? Instead of that, use that input
as the actual title. This will even kill the usual confusion that a new
editor encounters when they don't know how to create a new page. All
you'll need to do is go to any new page with &action=edit, even
something like http://en.wikipedia.org/w/?action=edit (or
http://en.wikipedia.org/w/?action=create to make it clearer), type
the title of the page into the input, and save it, instead of mucking
with other stuff and extensions like Inputbox. Of course, if that page
already exists, then you'll simply get an error telling you it already
exists, and you should either edit that page or find a new title.
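The create flow described above could look roughly like the following sketch, written in Python rather than PHP. Everything in it is hypothetical: to_key is a toy stand-in for the real normalization, pages is a dict standing in for the page table, and TitleExistsError is an invented error type.

```python
class TitleExistsError(Exception):
    """Raised when the submitted title normalizes to an existing page."""


def to_key(real_title: str) -> str:
    # Toy normalization (an assumption, not MediaWiki's rules):
    # trim, turn spaces into underscores, capitalize the first letter.
    key = real_title.strip().replace(" ", "_")
    return key[:1].upper() + key[1:]


def create_page(pages: dict, real_title: str, text: str) -> str:
    """Create a page using the Title input as the actual (real) title."""
    key = to_key(real_title)
    if not key:
        raise ValueError("a title is required")
    if key in pages:
        # Same outcome as the email describes: tell the user it exists.
        raise TitleExistsError(f"{key} already exists")
    pages[key] = {"real": real_title.strip(), "text": text}
    return key


pages = {}
print(create_page(pages, "iPod touch", "Some text"))  # IPod_touch
print(pages["IPod_touch"]["real"])                    # iPod touch
```

Note that the uniqueness check runs on the key form while the text the user typed is kept as the real form, so "IPod touch" submitted afterwards would be rejected as a duplicate.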
Now, as for redlinks and getting a real form through the URL: I'd
propose a secondary parameter, something like &titlesuggest=, which a
redlink would append to keep the formatting of the current title.
Extensions like Inputbox could do the same.
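A rough sketch of what such a redlink URL might look like, in Python rather than PHP. The parameter name titlesuggest comes from the proposal above; the base URL, the redlink_url helper and the toy normalization are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def redlink_url(base: str, link_text: str) -> str:
    """Build an edit URL that carries the link's original formatting."""
    # Toy key normalization (assumption): underscores, capital first letter.
    key = link_text.strip().replace(" ", "_")
    key = key[:1].upper() + key[1:]
    query = urlencode({
        "title": key,                # key form, used to find the page
        "action": "edit",
        "titlesuggest": link_text,   # real form, as written in the link
    })
    return f"{base}?{query}"


url = redlink_url("http://example.org/w/index.php", "has space")
print(url)
# http://example.org/w/index.php?title=Has_space&action=edit&titlesuggest=has+space

# The edit page can recover the suggested real title from the query string.
print(parse_qs(urlsplit(url).query)["titlesuggest"][0])  # has space
```

Because the suggestion travels in a query parameter, the + here is ordinary form encoding and decodes back to a space, unlike a + inside the title path.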
As for currently existing pages, we would probably leave that out and
use Special:Movepage to do the actual move. It's possible that in the
future someone could rewrite the edit system and move system to allow
for a setup where you can both move the page and edit its text at the
same time. However, that is a rewrite in an area which we probably
should not attempt inside the scope of the current title rewrite. It
can be done on its own later without requiring any work here.
----
I should also make a note on caching, and what effect that has on the
title. I'm actually using the LinkCache to generate the Real name from
the database. Why? Title.php never, ever uses a database query inside of
a get function to get information on a title. The only database queries
inside of there are part of things like the factory constructors dealing
with page ids.
The only other getter which needs database info is getArticleId. That
makes a query to the LinkCache, which does the database query on its
own and returns the result, as well as putting the link into the cache
to avoid any new queries.
Because of that, I went with the LinkCache for getting the real title.
Of course there are a few issues.
Firstly, because this is being stored in the cache, we should NEVER
store a user-input real title. For this reason, before the LinkCache
stores the title in its cache, it actually resets the real title using
that database query, and if it doesn't find one, it back-converts the
key. So it's not a good idea to use user-supplied titles here.
That has two side effects.
Firstly, if we were to make it so that the LinkCache just didn't set the
real title when it doesn't find one, we would be likely to end up with a
wiki error resulting from getText not normalizing to getDBkey at some
point. Or even worse, there could be a small possibility of an infinite
loop, where something which needs that real title queries again until
it gets it, which would never happen. Another effect would be that,
since no real title is stored, the LinkCache would be queried for a new
real title every time you call getText or anything else which depends
on it. That would be a heavy database burden resulting from LinkCache
not setting a fallback title.
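The caching behaviour described above can be modelled with a short sketch. This is Python rather than PHP and is not the actual LinkCache code: the class name, the dict standing in for the page table, and the query counter are all assumptions for illustration.

```python
from typing import Dict, Optional, Tuple


class LinkCacheSketch:
    """Toy model: key -> (page id, real title), filled from the 'database'."""

    def __init__(self, page_table: Dict[str, Tuple[int, str]]):
        self.page_table = page_table   # stands in for the page table
        self.cache: Dict[str, Tuple[int, str]] = {}
        self.queries = 0               # counts simulated database queries

    def lookup(self, key: str,
               user_supplied_real: Optional[str] = None) -> Tuple[int, str]:
        # user_supplied_real is accepted only to show that it is ignored:
        # a user-input real title is never stored in the cache.
        if key in self.cache:
            return self.cache[key]
        self.queries += 1
        row = self.page_table.get(key)
        if row is None:
            # No page: store a back-converted fallback so later calls
            # do not hit the database again.
            row = (0, key.replace("_", " "))
        self.cache[key] = row
        return row


cache = LinkCacheSketch({"Main_Page": (1, "Main Page")})
print(cache.lookup("Main_Page"))                                # (1, 'Main Page')
print(cache.lookup("No_such_page", user_supplied_real="no_such_page"))
# (0, 'No such page')
cache.lookup("No_such_page")
print(cache.queries)                                            # 2
```

The third lookup is served from the cache, which is the reason for storing a fallback title: without it, every call for a nonexistent page would trigger another query.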
Secondly, if anyone uses getArticleId on your title object, there is
the possibility that the real title will be reset. I've put a warning
about this inside the setter: that setter is meant strictly for
temporary use where you know what is happening to the title. Primarily
it is used by the LinkCache to actually set the title (since the
LinkCache isn't the same object, it's bad practice to externally edit a
member variable that could change names, and doing so would generate
PHP errors if we later refactored things to use actual private
variables). It will also be used when moving a page, to set the new
real title to move to.
From my detailed look at Title.php, there is little other way to do
this that does not have a serious effect on the database or involve bad
coding which would result in a lot of bugs.
~Daniel Friesen (Dantman) of:
-The Gaiapedia (http://gaia.wikia.com)
-Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
-and Wiki-Tools.com (http://wiki-tools.com)
Simetrical wrote:
On Wed, Mar 12, 2008 at 11:22 AM, DanTMan
<dan_the_man(a)telus.net> wrote:
getName will be deprecated...
To go with the whole key/real namescheme I've been going with in
Title.php a new getRealName function will get the name to use for
interface display.
And to match that, getKeyName will get the name for use in uniqueness
checking and comparison, and getName will be aliased to it.
Are "RealName" and "KeyName" the best terms to use? We already use
"Text" and "DBKey" for titles, but I recall that confused me somewhat
for a while. I would probably have done "DisplayName" and
"NormalizedName", but that might not be ideal either. We may as well
think about this now instead of being stuck with weird names forever.
^_^ Actually about your note on User and Title normalization not being
the same. There is no real reason for them not to be (With the exception
of the stuff that we stick in functions like isValidName)...
Why's that? A little bonus I already theorized but never mentioned (I'm
good at grasping a lot of theory and wrapping my mind around how things
work and are supposed to, so I get a lot of them)
Because of the new extensible normalization, and how all the username
stuff relies on getDBkey and directly uses getText for displaying the
username, there is a little bonus.
If you go and extend the normalization of Titles specifically for the
User: namespace (remember that because of the way it's setup, you can
now create per-namespace normalization), the normalization of Usernames
will be directly affected by it (Which is kinda why I needed to alter
User.php because of that login bug).
Oh, that's very neat. It preserves a one-to-one correspondence
between usernames and User-namespace titles -- almost. Are you going
to do stuff like ban '@' and other things not allowed in usernames
from the namespace? That would make it a perfect bijection between
User pages and user names-plus-IP addresses.
Btw: I have a function inside of the normalizer,
TitleNormalizer::backconvert( $title ); basically it does the normal
replacing of underscores with spaces. The point of it is for when we
don't have a page_real stored in the database (i.e. a nonexistent page):
backconvert will then be used to create a temporary title for displaying
while the page doesn't exist. Of course, there is a hook inside of it
which lets extensions override it in case they do something like
changing the ' ' to '_' normalization to ' ' to '-' for some reason.
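As a rough picture of the back-conversion with an override hook, here is a Python sketch. It is not the TitleNormalizer code from the patch: the hook list and its calling convention are invented for illustration, and only the underscore-to-space default comes from the description above.

```python
from typing import Callable, List, Optional

# Hypothetical hook registry: each hook returns a display title,
# or None to let the default behaviour run.
backconvert_hooks: List[Callable[[str], Optional[str]]] = []


def backconvert(key: str) -> str:
    """Turn a key-form title into a temporary display title."""
    for hook in backconvert_hooks:
        result = hook(key)
        if result is not None:
            return result
    # Default: the normal replacing of underscores with spaces.
    return key.replace("_", " ")


print(backconvert("Has_space"))   # Has space

# An extension that normalizes ' ' to '-' would reverse that instead.
backconvert_hooks.append(lambda key: key.replace("-", " "))
print(backconvert("Has-space"))   # Has space
```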
Hmm, I see. When would this be a concern? Shouldn't the page_real be
generated from the URL? I guess not exactly, if link targets are
normalized. I'm thinking if the user types, I dunno, "str_repeat"
into the search box, they should get links asking them to edit
"str_repeat", not "str repeat" or any other variant. The same should
apply to ordinary wikilinks, ideally -- but on the other hand,
non-broken wikilinks should still point to prettified locations. So I
guess this would require [[has space]] to translate to
?title=Has_space (or whatever normalized form) if it exists, but
?title=has%20space&action=edit if it doesn't. Which isn't perfect.
But I don't see any other way to achieve the effect.
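The link translation suggested in the paragraph above could be sketched as follows, in Python rather than PHP. The function names, the toy normalization and the set of existing keys are all hypothetical; only the two URL shapes come from the text.

```python
from urllib.parse import quote


def to_key(link_text: str) -> str:
    # Toy normalization (assumption): underscores, capital first letter.
    key = link_text.strip().replace(" ", "_")
    return key[:1].upper() + key[1:]


def link_target(link_text: str, existing_keys: set) -> str:
    """Pretty URL for existing pages, raw text plus edit for missing ones."""
    key = to_key(link_text)
    if key in existing_keys:
        return "?title=" + quote(key, safe="")
    # Keep exactly what the user wrote so the new page can use it.
    return "?title=" + quote(link_text, safe="") + "&action=edit"


print(link_target("has space", {"Has_space"}))  # ?title=Has_space
print(link_target("has space", set()))          # ?title=has%20space&action=edit
print(link_target("str_repeat", set()))         # ?title=str_repeat&action=edit
```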
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l