Re: [Wikitech-l] UTF

11 Jan 2003

Tomasz Wegrzanowski wrote:

...
 youandme wrote: 
...
 >Writing in any language is writing in this specific language
>respecting _its_ rules, even if the lack of diacritics is a rule.
...
 No. Its rules must be obeyed only for words of that language.
For words from other languages, like names of people, places etc.
rules of source language must be obeyed. 
Have y'all read the discussions from November on <wikiEN-l>
(often spilling into <wikipedia-l> since the English list was new)
on the English Wikipedia's policy of anglicising names?
There was a lot of argument that could give context here.

...
 >Instead of rather writing in the language overridden by some
>extra syntax. Giving a correct spelling once in parentheses
>is what IMO is sufficient. 
...
 Without UTF-8 you can't even do that. 
You can (and we often do, on [[en:]]) using HTML entities,
such as &#268; (for "C" with a hacek, TeX's "\v C").
What UTF-8 encoding does is to allow:
* Direct entry of the UTF-8 character into the edit box;
* UTF-8 characters in titles.
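
To make the difference concrete, here is a minimal sketch, assuming Python 3
and only its standard library (the helper name fits_latin1 is made up for
illustration). It decodes the entity &#268;, shows the bytes UTF-8 would
store for it, and checks that the character has no Latin-1 encoding:

```python
import html

# The numeric character reference from the message: decimal 268 is U+010C.
entity = "&#268;"
char = html.unescape(entity)

# UTF-8 stores this character as two bytes.
utf8_bytes = char.encode("utf-8")


def fits_latin1(text: str) -> bool:
    """Return True if every character of text exists in Latin-1 (ISO 8859-1)."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(char, hex(ord(char)), utf8_bytes)
print(fits_latin1("\u00e9"), fits_latin1(char))
```

An e-acute passes the Latin-1 check while the C with a hacek does not, which
is why the latter needs either an entity or a wider encoding such as UTF-8.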

Direct entry of even Latin-1 characters into the edit box
already screws up a few browsers every once in a while,
so I always change them to HTML entities when I see them.
Thus, the reason for UTF-8 would be non-Latin-1 titles.
Since [[en:]]'s policy favours anglicisation
(more than *I* would like! ^_^), this isn't vital.
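
The habit of changing directly entered characters to HTML entities can be
sketched in a few lines. This is only an illustration, assuming Python 3;
the function name to_entities is hypothetical, and it relies on the standard
"xmlcharrefreplace" error handler rather than on anything in the wiki
software:

```python
def to_entities(text: str) -> str:
    """Replace each non-ASCII character with a decimal numeric character reference."""
    # "xmlcharrefreplace" turns any character the ASCII codec cannot encode
    # into &#NNN; where NNN is the character's code point in decimal.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


sample = "caf\u00e9 and \u010cech"
converted = to_entities(sample)
print(converted)
```

Plain ASCII text passes through unchanged, so running this over text that
is already clean does no harm.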

Still, there are a few times when it would be appropriate under the policy.
When there is no English standard for a foreign name,
then we give it diacritics in the title if it's Latin-1;
with UTF-8, we could extend this practice to other Latin alphabets.
And (thanks to TeX, I suppose), mathematicians commonly use
any feature of the Latin alphabet supported by plain TeX,
including the aforementioned &#268; (as in "&#268;ech homology").
So UTF-8 encoding would still be useful on [[en:]] --
just not (IMO) a pressing concern.

-- Toby
