On 7/25/05, Tomer Chachamu <the.r3m0t(a)gmail.com> wrote:
On 25/07/05, Nikola Smolenski <smolensk(a)eunet.yu> wrote:
The way I see it, this decision is a political and not a technical one. Each
word could have several spellings, each of which is related to a spelling
authority. If you want common misspellings in the dictionary, simply have
"Common misspelling" as a spelling authority. Similarly, nothing prevents you
from having several different spellings of the same word attributed to a single
spelling authority, which solves all the problems you mentioned above.
Surely it would be better for the search function to try some other
possibilities, such as removing glottal stops from the input and
searching again.
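
A rough sketch of that fallback, in Python. The character set, the
function names and the list of titles are all assumptions made up for
illustration, not anything MediaWiki or UW provides. Note that it also
strips glottal stops from the stored titles when retrying, which goes a
little beyond "removing them from the input".

```python
# Assumed set of characters used to write a glottal stop:
# okina, modifier apostrophe, IPA glottal stop, ASCII and curly quotes.
GLOTTAL_STOPS = "\u02bb\u02bc\u0294'\u2018\u2019"


def strip_glottal_stops(text):
    """Return text with every assumed glottal-stop character removed."""
    return "".join(ch for ch in text if ch not in GLOTTAL_STOPS)


def search(query, titles):
    """Try an exact match first; if none, retry ignoring glottal stops."""
    exact = [t for t in titles if t == query]
    if exact:
        return exact
    key = strip_glottal_stops(query)
    return [t for t in titles if strip_glottal_stops(t) == key]
```

With titles such as "Hawai\u02bbi" and "Oahu", a query for "Hawaii" or
"O'ahu" would only match on the second pass.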
Besides, didn't we forget the Unicode possibility of the same
character entered in two ways?
http://en.wikipedia.org/wiki/Canonical_equivalence
The current version of MediaWiki does Unicode canonical
normalisation. I would hope that UW would too. Compatibility
normalisation is a different kettle of fish though.
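
To illustrate the difference, here is a small sketch using Python's
standard unicodedata module. The sample characters are my own choice:
"é" entered two ways for the canonical case, and the "fi" ligature for
the compatibility case.

```python
import unicodedata

# "é" entered two ways: precomposed U+00E9, or "e" plus combining acute U+0301.
precomposed = "\u00e9"
combining = "e\u0301"

# The raw strings differ, but canonical normalisation (NFC) makes them equal.
raw_equal = precomposed == combining
nfc_equal = (unicodedata.normalize("NFC", precomposed)
             == unicodedata.normalize("NFC", combining))

# The "fi" ligature U+FB01 is only compatibility-equivalent to "fi":
# NFC leaves it alone, NFKC folds it into two letters.
ligature = "\ufb01"
nfc_ligature = unicodedata.normalize("NFC", ligature)
nfkc_ligature = unicodedata.normalize("NFKC", ligature)
```

So canonical normalisation answers the "same character entered in two
ways" question, while ligatures and similar cases need the compatibility
forms, which lose information.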
Actually, ideally every word would have a "search name" which is
worked out based on the language. It would only contain the compulsory
characters, with a certain decision on e.g. German ä ö ü (to transform
them in one direction or the other).
Well since there are German words which can be spelled with ae, oe,
ue; but not with ä, ö, ü, the diacritic versions have to be the canonical
ones. The same goes for French œ which can always be spelled as
oe but not every oe can be spelled as œ. However, ß is always the
wrong spelling in Switzerland, but not every ss can be spelled ß
in Germany and Austria either.
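
One way to reconcile the two points: keep the diacritic spelling as the
stored, canonical form, and fold only the derived "search name" in the
direction that is always valid (ä to ae, ß to ss, œ to oe). A minimal
Python sketch, where the table and the search_name function are
hypothetical and cover just the characters mentioned in this thread:

```python
import unicodedata

# Assumed folding table: each special form maps to its always-valid spelling.
# The reverse mapping (ae -> ä, ss -> ß) is deliberately not attempted.
FOLD = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss", "œ": "oe"})


def search_name(word):
    """Derive a lossy lookup key; the canonical spelling is stored elsewhere."""
    # Compose first so "u" + combining diaeresis is treated like precomposed "ü".
    composed = unicodedata.normalize("NFC", word).lower()
    return composed.translate(FOLD)
```

Under this sketch "Müller" and "Mueller" share the key "mueller", and
"Straße" and "Strasse" share "strasse", so either spelling finds the
entry without anyone having to guess which ss should be ß.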
It could also decompose all precomposed characters into sets of characters.
This is a little bit like how some databases store the soundex index.
Unicode canonical normalisation does this.
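
Specifically, it is the decomposed form (NFD) that splits a precomposed
character into a base letter plus combining marks. A short Python sketch;
strip_marks is an extra, hypothetical step that keeps only the base
letters, which is one possible reading of "only the compulsory
characters" and not something normalisation does by itself.

```python
import unicodedata


def decompose(text):
    """Canonical decomposition: precomposed characters become base + marks."""
    return unicodedata.normalize("NFD", text)


def strip_marks(text):
    """Assumed extra step: drop combining marks after decomposing."""
    return "".join(ch for ch in decompose(text)
                   if not unicodedata.combining(ch))
```

Here decompose("ä") gives "a" followed by U+0308, and strip_marks("Zürich")
gives "Zurich". For German that is a different key from the ae/oe/ue
folding, so the choice would still have to be made per language.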
I think this would handle most cases. I certainly agree that redirects
are a necessary technical feature for the rarer cases.
Or maybe thinking outside the MediaWiki box altogether if UW is
to be designed from the ground up.
--
http://linguaphile.sf.net