On Fri, 23 May 2003, Lee Daniel Crocker wrote:
* Cannot allow: # (sharp), | (pipe), " (quote), [] (brackets),
{} (braces), <> (less, greater), + (plus), \ (backslash) because
allowing them would interfere with link syntax and make the
software trickier to write. I can live without these, though
I think + might be handy in some places (like C++), and might be
worth the effort to allow.
Plus + and quote " are frequently asked for. These would not interfere
with wiki syntax at all, though both would require escaping in URLs (as
do the ampersand & when used in the query string, and the percent % and
question mark ? always; we presently allow all three of those).
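To make the escaping concrete, here is a minimal sketch in Python (used purely for illustration; it is not the wiki's PHP code). The helper name title_to_url_part is hypothetical, and passing safe="" so that every reserved character gets escaped is an assumption, not a statement of how the software builds its URLs.

```python
from urllib.parse import quote, unquote

def title_to_url_part(title: str) -> str:
    """Hypothetical helper: percent-encode a page title (as UTF-8) for a URL."""
    # safe="" asks quote() to escape every reserved character, including "/".
    return quote(title, safe="")

# The characters discussed above and their percent-encoded forms.
for ch in ['+', '"', '&', '%', '?']:
    print(repr(ch), "->", title_to_url_part(ch))

escaped = title_to_url_part("C++")
print(escaped)            # C%2B%2B
print(unquote(escaped))   # C++
```

The encoding round-trips, so a title containing + or " can be recovered exactly from the URL as long as both directions use the same rules.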
* Should allow anything Unicode calls a letter, numeral, syllable,
or ideograph.
Okay...
* Should not allow Unicode diacriticals, combining forms, display
forms (ligatures), controls, and other specials.
Waitaminute... that would seem to exclude the use of accented characters
that do not have a precomposed form. This could be seriously detrimental
to some languages.
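A small sketch of the problem, in Python for illustration only. It assumes the proposed rule amounts to "every character must have a Unicode general category starting with L (letter) or N (number)", which is one reading of the proposal and not a confirmed design; the function names are made up for this example.

```python
import unicodedata

def is_letter_or_number(ch: str) -> bool:
    # General categories L* cover letters, syllables and ideographs; N* cover numerals.
    return unicodedata.category(ch)[0] in ("L", "N")

def allowed_strict(title: str) -> bool:
    """Hypothetical strict rule: reject anything that is not a letter or number,
    which also rejects combining marks (categories Mn, Mc, Me)."""
    return all(is_letter_or_number(ch) for ch in title)

# "e" + COMBINING ACUTE ACCENT composes to the single code point U+00E9.
e_acute = unicodedata.normalize("NFC", "e\u0301")
# "x" + COMBINING ACUTE ACCENT has no precomposed form, so the mark survives.
x_acute = unicodedata.normalize("NFC", "x\u0301")

print(len(e_acute), allowed_strict(e_acute))   # 1 True
print(len(x_acute), allowed_strict(x_acute))   # 2 False
```

Even after composing everything that can be composed, the second string still contains a combining mark, so a blanket ban on combining forms would reject it.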
(In any case, we ought to do a little fancier work with UTF-8 to make sure
that canonical forms are used to prevent false non-matches. I don't know
if there's a library we can link into PHP to do this or if we'd have to
write something.)
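For illustration, this is roughly what such canonicalization looks like using Python's standard unicodedata module. It is a sketch, not a proposal for the PHP side: the choice of NFC as the canonical form and the name canonical_title are assumptions made for the example.

```python
import unicodedata

def canonical_title(title: str) -> str:
    """Hypothetical helper: bring a title into one canonical form (NFC assumed)."""
    return unicodedata.normalize("NFC", title)

precomposed = "caf\u00e9"    # e-acute as a single code point
decomposed = "cafe\u0301"    # plain e followed by a combining acute accent

# The raw strings differ, so a naive comparison gives a false non-match.
print(precomposed == decomposed)                                    # False
# After normalization both spellings compare equal.
print(canonical_title(precomposed) == canonical_title(decomposed))  # True
```

Normalizing once when a title is saved and once when it is looked up would be enough to make both spellings land on the same page.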
-- brion vibber (brion @ pobox.com)