Peter wrote:
>Not at all! It clearly shows what happens when a page
>_IS_ 8859-1 encoded but editors want to use fancy
>characters. Same happens when they do it with an old
>browser on UTF-8 pages. So, you get trash either way,
>other editors revert the same way, so you may well
>use utf-8, don't you? :-)
OIC. Well a person should /not/ be using fancy things
like curly quotes and long hyphens because many
browsers (especially on non-MS systems) display them
as question marks. These should be fixed, not allowed
to propagate. The fact that some browsers break these
codes should be a good hint that they should not be
used to begin with (esp. since ASCII quotes and
regular hyphens can be used instead).
But a way around the larger issue is to sniff whether
or not a browser is UTF-8-aware and then serve a page
in either UTF-8 or in Latin-1 (ISO 8859-1) based
on that. When a UTF-8 page is displayed it shows the
actual non-Latin characters; when the Latin-1 page is
displayed it shows the codes that represent those
characters.
That at least will prevent pages from getting damaged,
but the special characters will still show up as
question marks for people with older browsers, so
things like curly quotes and long hyphens should be
automatically converted to their ASCII counterparts.
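
The two steps proposed above (pick the charset per browser, fall back to codes and ASCII punctuation for Latin-1) could be sketched roughly as follows. This is only an illustration, not anything the wiki software actually does: the function names are made up, it assumes the browser sends an Accept-Charset header at all (many do not), and it applies the ASCII conversion only on the Latin-1 path, which is one possible reading of the suggestion.

```python
# Hypothetical names throughout; a sketch, not the wiki software's code.

# Curly quotes and long dashes mapped to plain ASCII stand-ins.
ASCII_FALLBACKS = {
    "\u2018": "'", "\u2019": "'",   # single curly quotes
    "\u201c": '"', "\u201d": '"',   # double curly quotes
    "\u2013": "-", "\u2014": "--",  # en dash, em dash
}


def accepts_utf8(accept_charset):
    """Return True if an Accept-Charset header value allows utf-8 (or *)."""
    for item in accept_charset.split(","):
        name, _, param = item.partition(";")
        if name.strip().lower() not in ("utf-8", "*"):
            continue
        param = param.strip().lower()
        if param.startswith("q="):
            try:
                if float(param[2:]) == 0.0:
                    continue  # q=0 means "explicitly not acceptable"
            except ValueError:
                continue  # malformed weight: ignore this entry
        return True
    return False


def to_ascii_punctuation(text):
    """Replace curly quotes and long dashes with ASCII counterparts."""
    return text.translate(str.maketrans(ASCII_FALLBACKS))


def render(text, accept_charset):
    """Return (charset, body bytes) suited to the requesting browser."""
    if accepts_utf8(accept_charset):
        return "utf-8", text.encode("utf-8")
    # Latin-1 path: anything outside ISO 8859-1 becomes a numeric
    # character reference such as &#321; instead of being lost.
    body = to_ascii_punctuation(text).encode(
        "iso-8859-1", errors="xmlcharrefreplace"
    )
    return "iso-8859-1", body
```

With this sketch, a browser that only advertises iso-8859-1 receives plain quotes and numeric codes for the non-Latin-1 letters, while a UTF-8-aware browser receives the text unchanged.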
-- Daniel Mayer (aka mav)
__________________________________
Do you Yahoo!?
Protect your identity with Yahoo! Mail AddressGuard
http://antispam.yahoo.com/whatsnewfree
Mav said the following quoted text:
>Well a person should /not/ be using fancy things
>like curly quotes and long hyphens because many
>browsers (especially on non-MS systems) display them
>as question marks.
I expect a couple of typographers to get serious heart problems from that.
>These should be fixed, not allowed
>to propagate.
Please! Let's all return to the 6-bit ASCII set, everything else is
bloat, and unsupported by older systems.
>The fact that some browsers break these
>codes should be a good hint that they should not be
>used to begin with (esp. since ASCII quotes and
>regular hyphens can be used instead).
Just as I said above.
Also, from another post:
>When I see "You need to
>upgrade your browser" I leave and never come back.
Please, also, keep your Netscape 4 and watch it break every standard and
recommendation that W3C proposed. You may also try things with
NetPositive and the like, which, though newer, support
/nothing/. Hell, lynx and (e)links do better at 'splaying a page. And
indeed, if your browser, or whoever's browser, SUCKS, then sorry, compile
a links. It shouldn't be /that/ hard, especially for hardcore users
of heavily outdated servermonsters (which are, granted, otherwise quite
good).
Unicode is the Present. Live With It™.
--
"Your job is being a professor and researcher: That's one hell of a good
excuse for some of the brain-damages of minix." (Torvalds to Tanenbaum)
--
Ralesk Ne'vennoyx
[ICQ:37046326]
>From: Brion Vibber <brion(a)pobox.com>
>Mav, do you have a constructive suggestion for improving compatibility
>with older browsers? If so, please post it on wikitech-l, where we're
>discussing the issue. We already know you don't like the idea of
>breaking older browsers.
He does not like it because it is unkind to newbies,
and it generates bad vibes between oldbies (does
that word exist?) as well.
I am glad I do not break pages any more, whether
because of UTF-8 (Opera), the cut-off at 30 KB
(Opera), the freezing edit window (IE), or
the random spaces added inside words (Netscape 4). I
still mess with HTML though :-)
An idea (perhaps totally stupid): would it be possible
for *some* pages to be in UTF-8 while most are not?
The pages that could benefit from UTF-8 (such as the list
of Arabic terms) would be in UTF-8, perhaps with a
box to check when saving, and the others would be left as
is. Vincent et al. could work nicely, most readers
would perhaps see more things, and the pages where
problems could occur would be limited.
Is that possible?
Tomasz-
> * broken browsers - they should be upgraded, if someone has browser so old
>   that it doesn't grok UTF-8, it's not going to grok CSS,
>   PNGs, and other things we're using either.
>   Unless we want to remove all CSS and PNGs, there's
>   no point in not using UTF-8.
Is this true? All I know is that we had a *lot* of problems with broken
special chars on the Meta-Wiki during the logo contest. I have no idea
which browser broke them, but it seems to be a not totally uncommon
one, perhaps in the 5% range. Given that a single edit by such a person will
break an entire page, it might not be so wise to switch (but perhaps
I'm missing something -- is Meta running UTF-8?).
Regards,
Erik
I confirm it is in UTF-8.
I broke many pages there for months, refrained
from editing many others, and wrote all French pages
without any accents.
I also noticed during the logo contest that some
people were breaking pages as well, so obviously some
people are still using such browsers (I recommend Netscape 7 for
Mac OS 9).
I really fear we will have to check any newbie contribution
very carefully, and perhaps tell them to go away.
I hope we consider this carefully before switching
everything. I understand there would be some benefits,
but perhaps some drawbacks as well. Any relevant
numbers (on browsers that do not support UTF-8) would be
interesting. IMHO.
Brion wrote:
>Mav, do you have a constructive suggestion for
>improving compatibility with older browsers? If so,
>please post it on wikitech-l, where we're discussing
>the issue. We already know you don't like the idea of
>breaking older browsers.
I don't care about breaking old browsers, I care about breaking articles and
not being at least minimally accessible to people who use old browsers. One
thing is certain, however: Switching to UTF-8 will not improve compatibility
with old browsers or improve the experience of the people who use them.
-- Daniel Mayer (aka mav)
Erik wrote:
>Is this true? All I know is that we had a *lot* of problems
>with broken special chars on the Meta-Wiki during the logo
>contest. I have no idea which browser broke them, but it
>seems to be a not totally uncommon one, perhaps in the
>5% range. Given that a single edit by such a person will
>break an entire page, it might not be so wise to switch
>(but perhaps I'm missing something -- is Meta running UTF-8?).
IIRC meta is. And that fact has created some of the problems you mention. I
therefore see no compelling need to convert Latin-1 languages to UTF-8 and in
fact think such a switch would be harmful. It is also wrong-headed to state
(as Tomasz did) that if people have non-UTF-8-friendly browsers that they
should upgrade. That is not the attitude we should have when things work just
fine the way they are (at least on the English Wikipedia - others may have
more compelling reasons to use UTF-8 that outweigh the negatives).
The only place where UTF-8 would be very useful is with interlanguage links.
But that could better be solved by placing all interlanguage links outside of
the regular wiki text of pages. That separate edit window could support UTF-8
and be shared by all Wikipedias. This should minimize damage done by
non-UTF-8-compliant browsers and as an added benefit could be part of an
easier way to add language links to articles (inputting the links once would
create language links in every article listed in the common meta space).
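
The shared-links idea in the paragraph above might look something like the following. This is a hypothetical sketch of the data layout only: the store, its contents, and the function name are invented for illustration and do not describe any existing feature.

```python
# Hypothetical shared store kept outside the article text: one entry per
# topic, mapping language code -> article title in that language.
shared_links = [
    {"en": "Warsaw", "pl": "Warszawa", "de": "Warschau"},
    {"en": "Water", "de": "Wasser", "fr": "Eau"},
]


def language_links(lang, title, store):
    """Return the other-language titles for the article `title` on wiki `lang`."""
    for entry in store:
        if entry.get(lang) == title:
            # Entering the set once yields links for every listed article.
            return {code: name for code, name in entry.items() if code != lang}
    return {}
```

Because the titles live in one place, an edit to an article body made from a non-UTF-8 browser never touches them.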
-- Daniel Mayer (aka mav)
Tomasz wrote:
>1.
>There are many reasons other than interwiki. ISO 8859-1 is
>broken by design - it doesn't even encode all Latin characters,
>and other characters are also needed for correct Latin-script
>typography.
Oh, which ones? All the Latin characters on my keyboard work just fine.
Extended Latin characters also work fine. If you want anything more fancy
then use &code;.
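
For anyone who wants to check point 1 for themselves, a small test like this shows which Latin-script letters ISO 8859-1 cannot encode directly. The sample string is just an arbitrary selection, and the function name is made up for this sketch.

```python
def missing_from_latin1(text):
    """Return the characters in `text` that ISO 8859-1 cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode("iso-8859-1")
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


# A few Latin-script letters from various European languages.
sample = "\u00e9\u00f1\u00fc\u00df\u0141\u0151\u0153\u0161"  # é ñ ü ß Ł ő œ š
print(missing_from_latin1(sample))
```

The first four letters are in Latin-1; the last four (Polish Ł, Hungarian ő, French œ, Czech š) are not and would need a numeric code on a Latin-1 page.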
>2.
>Things are NOT fine the way they are. At least not for English
>Wikipedia.
That begs the question. What is wrong?
>3.
>And, as I said, we already break compatibility with very old
>browsers in many ways. Or do you maybe want to ban all PNGs,
>OGGs etc., and implement some converter from CSS to HTML3-
>compatible markup ?
Come on. Not being able to view PNGs, OGGs, fancy CSS tricks and HTML-3 only
stuff does not harm any article. However, somebody using an older browser can
really screw up an article on a UTF-8 page. This already happens fairly
often on meta, even though meta has very low traffic compared with
en.wikipedia. You cannot compare the two.
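
One plausible way such damage happens can be simulated in a few lines. The thread does not identify the exact browsers or mechanism, so this is an assumption: the browser submits the edit form in Latin-1, replacing whatever it cannot represent with "?", and the server then reads the submission as UTF-8. The function name is hypothetical.

```python
def legacy_browser_roundtrip(page_text):
    """Simulate one save of a UTF-8 page from a Latin-1-only browser."""
    # Assumed browser behaviour: submit the form as ISO 8859-1, with
    # unrepresentable characters replaced by "?".
    submitted = page_text.encode("iso-8859-1", errors="replace")
    # The server expects UTF-8; stray Latin-1 bytes are invalid there
    # and come out as U+FFFD replacement characters.
    return submitted.decode("utf-8", errors="replace")


damaged = legacy_browser_roundtrip("\u0141\u00f3d\u017a")  # "Łódź"
print(damaged)
```

Pure ASCII text survives the round trip, but every non-ASCII character on the page is lost in a single save, even if the editor only changed an unrelated sentence.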
UTF-8 is a non-starter for en.wikipedia, and probably for most other
Latin-based wikis as well. But the choice of charset should be up to each
wiki and that choice should reflect the needs of that wiki.
-- Daniel Mayer (aka mav)
Please help me to get rid of the so-called Admin status (German
WP). Who is able to tweak the database system accordingly?
--
| ,__o
| _-\_<,
http://www.gnu.franken.de/ke/ | (*)/'(*)
SunTrust account is unchanged since last update, unless some bank
fees were charged:
$1,862.23
Paypal has various amounts in various currencies, with a USD value of
$1,533.20 -- this is up just over $200 from the last time I reported,
1 week ago.
I can absolutely say that removing the donation text from all the pages
has resulted in a significant drop in incoming funds, as expected.
--Jimbo
Thanks for arranging this, Jimbo.
I'm not sure everyone realizes how much of a sacrifice in time and money
it is, for Jimbo to do all the shopping and follow-up - and for him to
send employees of Bomis to set up equipment.
Ed Poor