On 4/13/06, Steve Bennett <stevage(a)gmail.com> wrote:
> That would be nice, but even the simple mechanism of exact matches
> would be a start. And then you can add fallbacks, like all upper
> case, all lower case, upper case first letter of each word and so on.
> If performance is the issue here.

All upper/lower isn't really effective on Wikipedia because most of
our multiword titles are mixed case.
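
To make that concrete, here is a rough Python sketch of the fallback
idea. The function names and the sample titles are made up for
illustration and this is not how MediaWiki does lookups; it only shows
which variants such a scheme would try and where it falls short.

```python
def fallback_candidates(query):
    """Return the case variants to try, in order, without duplicates."""
    variants = [
        query,            # exact match first
        query.upper(),    # all upper case
        query.lower(),    # all lower case
        # upper case first letter of each word
        " ".join(w[:1].upper() + w[1:].lower() for w in query.split(" ")),
    ]
    ordered = []
    for variant in variants:
        if variant not in ordered:
            ordered.append(variant)
    return ordered


def lookup(query, titles):
    """Return the first variant that is an existing title, else None."""
    for candidate in fallback_candidates(query):
        if candidate in titles:
            return candidate
    return None


# Hypothetical titles, chosen only to show the behaviour.
titles = {"New York City", "DNA", "History of the United States"}

print(lookup("new york city", titles))                 # New York City
print(lookup("dna", titles))                           # DNA
print(lookup("history of the united states", titles))  # None
```

The last lookup fails because the real title keeps "of" and "the" in
lower case, so none of the four variants hits it. That is the
mixed-case problem in a nutshell.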
AFAIR, most string matches in mysql are case insensitive, which would
mean that we could have indexed case insensitive matches quickly...
but I'm guessing that our use of binary fields for titles (which is
required because no version of mysql has complete UTF-8 support) most
likely breaks that.
Alas, mysql doesn't have functional indexes
(http://bugs.mysql.com/bug.php?id=4990) or it would be fairly trivial
to offer a fast case-insensitive match.
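
One possible workaround, sketched below, is to do by hand what a
functional index would do: store a case-folded copy of each title in
its own binary column and index that. The sketch uses Python's
built-in sqlite3 module only so that it runs anywhere; the table and
column names are invented and are not the real MediaWiki schema. The
folding is done in the application, so it does not rely on the
database understanding UTF-8.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# title stays binary, as now; title_lower is the extra folded copy.
conn.execute("CREATE TABLE page (title BLOB NOT NULL, title_lower BLOB NOT NULL)")
conn.execute("CREATE INDEX page_title_lower ON page (title_lower)")


def add_page(title):
    """Store the title plus its lower-cased key, both as UTF-8 bytes."""
    conn.execute(
        "INSERT INTO page (title, title_lower) VALUES (?, ?)",
        (title.encode("utf-8"), title.lower().encode("utf-8")),
    )


def find_titles(query):
    """Return all stored titles that equal the query ignoring case."""
    key = query.lower().encode("utf-8")
    rows = conn.execute(
        "SELECT title FROM page WHERE title_lower = ? ORDER BY title", (key,)
    )
    return [row[0].decode("utf-8") for row in rows]


for t in ["New York City", "New york city", "Zürich"]:
    add_page(t)

print(find_titles("NEW YORK CITY"))  # ['New York City', 'New york city']
print(find_titles("ZÜRICH"))         # ['Zürich']
```

The cost is a second column and index to keep in sync on every page
creation, move and delete, and a lookup can return several titles
that differ only in case, so the caller still has to pick one.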