On Nov 17, 2003, at 20:06, Lars Aronsson wrote:
Erik Moeller wrote:
perhaps in the 5% range. Given that a single edit by such a person will
break an entire page, it might not be so wise to switch (but perhaps I'm
missing something -- is Meta running UTF-8?).
Would it be possible to let the database run on UTF-8 internally, but
to let the PHP script analyze and convert data to and from certain
browsers? Perhaps the majority of users are using UTF-8-capable
browsers, so the conversion would use a minimum of resources.
Certainly possible, as long as care is taken to keep round-trips clean.
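To make "keeping round-trips clean" concrete, here is a rough sketch in Python (not the actual wiki code). The function names and the choice of ISO-8859-1 as the legacy charset are assumptions for illustration. The idea is that characters the old browser cannot represent go out as decimal character references and are expanded again on save, while references that were already in the stored text are escaped so they come back unchanged.

```python
import re

# Assumption: the problem browser is served ISO-8859-1 instead of UTF-8.
LEGACY_CHARSET = "iso-8859-1"

# Matches decimal character references such as &#8364;
_DECIMAL_REF = re.compile(r"&#(\d+);")


def to_legacy_browser(text):
    """Encode stored (Unicode) text for a browser that cannot edit UTF-8."""
    # Protect references already present in the text: "&#" becomes "&#38;#",
    # so the return trip restores them literally instead of expanding them.
    escaped = text.replace("&#", "&#38;#")
    # Characters outside the legacy charset become &#NNNN; references.
    return escaped.encode(LEGACY_CHARSET, errors="xmlcharrefreplace")


def _expand(match):
    code = int(match.group(1))
    if code > 0x10FFFF:
        # Not a valid code point; leave the text as the user typed it.
        return match.group(0)
    return chr(code)


def from_legacy_browser(data):
    """Decode a submission from the legacy browser back to Unicode text."""
    text = data.decode(LEGACY_CHARSET)
    # re.sub makes a single pass, so "&#38;#8364;" comes back as "&#8364;"
    # and is not expanded a second time.
    return _DECIMAL_REF.sub(_expand, text)
```

One ambiguity remains in this sketch: a reference that the user types by hand in the legacy browser is expanded on save, because it cannot be told apart from one generated by the conversion.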
Another possibility is simply to 'blacklist' known problem browsers by
printing a notice/link to better browsers on the edit page warning that
they may have problems, as we now have a warning on long pages that
some browsers may have problems. (Though in that case we aren't
checking specific browsers.)
The main problem browser these days is Internet Explorer for Mac; it's
years out of date and the most recent version still doesn't grok UTF-8
for editing. The most recent Macs ship with Safari as the default, but
most existing Macs out there are going to have IE or (shudder) Netscape
4.x as the default browser.
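A rough sketch of the blacklist idea, in Python for illustration. The User-Agent patterns below are assumptions based on typical strings for these browsers and would need checking against real traffic; the function names and the warning text are hypothetical.

```python
import re
from typing import Optional

# Assumed pattern, e.g. "Mozilla/4.0 (compatible; MSIE 5.23; Mac_PowerPC)"
IE_MAC = re.compile(r"MSIE [\d.]+; Mac")
# Assumed pattern, e.g. "Mozilla/4.79 [en] (Windows NT 5.0; U)"
NETSCAPE_4 = re.compile(r"^Mozilla/4\.\d+")


def problem_browser(user_agent: str) -> Optional[str]:
    """Return the name of a known problem browser, or None."""
    if IE_MAC.search(user_agent):
        return "Internet Explorer for Mac"
    # Many other browsers also claim "Mozilla/4.0" but add "compatible".
    if NETSCAPE_4.match(user_agent) and "compatible" not in user_agent:
        return "Netscape 4.x"
    return None


def edit_page_notice(user_agent: str) -> str:
    """Return the warning to print on the edit page (empty if none)."""
    name = problem_browser(user_agent)
    if name is None:
        return ""
    return ("Warning: %s may corrupt non-ASCII characters when saving "
            "this page. Consider editing with a different browser." % name)
```

This only prints a notice and never blocks the edit, in the same spirit as the existing long-page warning.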
All I know is that MySQL has better UTF-8 support from version 4.1.x,
as described in chapter 9, http://www.mysql.com/doc/en/Charset.html
The same goes for Perl version 5.8, but what about PHP?
PHP currently has pretty much no UTF-8 support aside from some
conversion functions. Strings are treated as arbitrary-length byte
sequences, and we've got some custom functions to deal with case
changing and the like.
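To illustrate why byte-oriented strings need custom case functions, here is a small Python analogy. It is not PHP and not the Utf8Case code, and the function names are hypothetical. Python's bytes type behaves much like the byte strings described above: its case methods only touch ASCII letters.

```python
def bytewise_upper(data):
    """Upper-case a byte string; only ASCII letters a-z are changed."""
    return data.upper()


def utf8_upper(data):
    """Upper-case UTF-8 data by decoding to characters first."""
    return data.decode("utf-8").upper().encode("utf-8")


word = "\u00e9a".encode("utf-8")   # "éa": 2 characters, 3 bytes

print(len(word))             # counts bytes, not characters
print(bytewise_upper(word))  # the "a" changes, the "é" bytes do not
print(utf8_upper(word))      # both letters are upper-cased
```

The same mismatch shows up in length and substring operations, since a byte offset can land in the middle of a multibyte character.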
There are some multibyte character set support functions which may or
may not be suitable for replacing the Utf8Case functions; that should
get looked into.
-- brion vibber (brion @ pobox.com)