Ok, B it is. I'll add another entry to updaters.inc when I get home and
start by converting all uses of getText in User.php to getDBkey.
After the actual title stuff is built, we can track down all the places
which use a displayable version of the name and make them use the
displayname instead.
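A rough sketch of what I mean by hiding the conversion behind the User class (the method names here are illustrative, not the actual MediaWiki API):

```php
<?php
// Hypothetical sketch -- the method and field names are assumptions
// for illustration, not committed MediaWiki code.
class User {
	/** @var string Canonical name in underscore (DB key) form. */
	private $mName;

	public function __construct( $dbkey ) {
		$this->mName = $dbkey;
	}

	/** Canonical, underscore-separated form used for identification. */
	public function getDBkey() {
		return $this->mName;
	}

	/** Displayable form: underscores swapped back to spaces. */
	public function getDisplayName() {
		return str_replace( '_', ' ', $this->mName );
	}
}
```

Callers that need something printable would go through getDisplayName(), so the underscore/space conversion lives in exactly one place.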
On another note, I guess this is my official statement on this part, but
I intend to create a new class for the normalization of titles.
The TitleNormalizer class.
It is used as an instance; its primary purpose is its normalize
function. It is constructed with a default set of sequence groups and
sequence passes.
A few notes on that:
- Because the normalizer runs through its passes sequentially, it has a
nicely defined order; to add another sequence inside a given area, a new
group can even be inserted to group sequences of another type.
- The reason the normalizer is used as an instance, and not used
statically, is for optimum extensibility. There may be cases where just
defining an extra sequence or two, or removing some, won't be enough to
make the change you want. To facilitate larger alterations to
normalization, someone can subclass TitleNormalizer with a new class
which includes their major normalizations, and use a hook (probably
'TitleNormalizerClass' or 'TitleNormalizerClassname') to have MediaWiki
instantiate a different class.
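To make the group/pass idea concrete, here is a rough sketch of the design I have in mind. The group names, the pass structure, and the hook name are all assumptions at this point, not committed code:

```php
<?php
// Sketch of the proposed TitleNormalizer; the group names, pass list,
// and hook name are assumptions for illustration only.
class TitleNormalizer {
	/** @var array Ordered groups, each an ordered list of pass method names. */
	protected $groups;

	public function __construct() {
		$this->groups = $this->getDefaultGroups();
	}

	/** Subclasses can override this to insert, remove, or regroup passes. */
	protected function getDefaultGroups() {
		return array(
			'prefix'   => array( /* interwiki/namespace splitting passes */ ),
			'fragment' => array( /* fragment extraction passes */ ),
			'title'    => array( 'trimTitle' ),
		);
	}

	/** Run every pass of every group, in their defined order. */
	public function normalize( $text ) {
		foreach ( $this->groups as $passes ) {
			foreach ( $passes as $pass ) {
				$text = call_user_func( array( $this, $pass ), $text );
			}
		}
		return $text;
	}

	/** Trim whitespace and underscores from the title portion. */
	protected function trimTitle( $text ) {
		return trim( $text, " _" );
	}
}

// Larger customizations would subclass, then a hook would let MediaWiki
// pick the class to instantiate (hypothetical hook name):
// $class = 'TitleNormalizer';
// wfRunHooks( 'TitleNormalizerClass', array( &$class ) );
// $normalizer = new $class();
```

The point of keeping groups as a plain ordered structure is that an extension can splice a new pass into an exact position, rather than only prepending or appending.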
Also, another important note: currently secureAndSplit includes the
trimming of whitespace as part of its task before splitting the
interwiki and namespace out. For various reasons I will be changing that
order. Nothing will be trimmed from the title before those are split
out; the prefix splitter will be responsible for temporarily trimming
whitespace and other stuff from the split text before trying to find out
what the prefix is. The actual trimming of whitespace will only happen
after that, and also only after the fragment is extracted too, when we
know we are actually working on the title portion only.
The current set of passes is actually quite hacky: it basically trims
whitespace, splits the interwiki, re-trims whitespace, splits the
fragment, then re-trims whitespace again just to make sure the actual
title gets its whitespace trimmed. Note that all three of those trims
are meant for the title, not the prefix or fragment, because as far as I
know the regex used to grab the prefix is specifically coded to ignore
extra whitespace in the namespace/interwiki in the first place.
Actually, on that note, there doesn't look to be much reason to use the
regex at all. So to cut down on that, I'm going to try using plain
string functions to pull out the prefixes and trim them off. A strpos,
substr, and trim set together is much quicker than a full-blown regex
pattern match.
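For comparison, here is the kind of substitution I mean (the regex here is a simplified stand-in, not the actual pattern in core):

```php
<?php
// Sketch: pulling a prefix off with string functions versus a regex.
// The regex shown is a simplified stand-in for illustration.
$text = 'Project : Some_Page';

// Regex version, whitespace-tolerant around the colon:
$prefixRegex = $restRegex = null;
if ( preg_match( '/^(.+?)\s*:\s*(.*)$/', $text, $m ) ) {
	$prefixRegex = $m[1];
	$restRegex = $m[2];
}

// strpos/substr/trim version -- cheaper, same result for this case:
$colon = strpos( $text, ':' );
$prefix = rtrim( substr( $text, 0, $colon ) );
$rest = ltrim( substr( $text, $colon + 1 ) );
```

Both approaches yield 'Project' and 'Some_Page' here, but the string-function version avoids compiling and backtracking through a pattern on every title parse.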
~Daniel Friesen(Dantman) of:
-The Gaiapedia (http://gaia.wikia.com)
-Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
-and Wiki-Tools.com (http://wiki-tools.com)
Simetrical wrote:
On Thu, Mar 6, 2008 at 2:43 AM, DanTMan
<dan_the_man(a)telus.net> wrote:
So, we have two options:
A) Hack up User.php to use getDBkey and replace _'s with spaces instead
of getText.
In particular, of course, using some nice User method that hides the
ugly conversion in one place.
B) Make use of getDBkey for identification of the user and have the
update script refactor the users table to use underscores like it should
instead of spaces.
The idea of having separate normalized/display names makes as much
sense for users as for titles, certainly. This seems like the more
logical option. It's not like we aren't going to have to rebuild and
repopulate the page table to do this anyway, so why not the user table
too?
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l