Let me preemptively apologize for the tone of this message.
Quick summary: if 'zh-min-nan' is not desirable, change to
'cfr.wikipedia.org'. See reasoning below.
Tim Starling, 2004-07-07 21:56:46+1000:
The reason for choosing zh-cfr can be summarised in
one word: Aromanian.
A wiki in the Aromanian dialect was requested, and it was agreed that we
should have one, but there was a problem with choosing the subdomain.
Aromanian does not have a distinct ISO 639 code; however, it is listed in
the Ethnologue under the code RUP. We had used only ISO 639 codes up to
that time.
The solution we came up with was to use the group ISO 639 code followed
by the SIL code. This allows us to specify most languages in the
Ethnologue without conflicting with the ISO standard, since ISO 639
codes do not contain hyphens.
All right. Now, why is it 'roa-rup' rather than just 'rup'?
Why 'zh-cfr' rather than just 'cfr'? That mixes ISO 639
codes unnecessarily with SIL codes, which creates a
monster: see the paragraphs below.
Maybe it is too late and the damage has been done, but one
of the several reasonable candidate solutions would be:
(option a)
Wikipedia host 2-letter sequence -> look up in ISO 639-1
Wikipedia host 3-letter sequence -> look up in SIL Ethnologue
But the current solution is:
(option b)
Wikipedia host containing hyphen -> look up in both ISO 639-1/639-2 and SIL
And there is no way to tell which language is meant when
one encounters something internally contradictory such as
en-frn.wikipedia.org. Is that English (ISO 639-1: en) or
French (SIL: FRN)? (One may argue that it will return
'This Wiki does not exist', but I still do not see why (b)
is in any way necessary or better than (a), other than the
farcical possibility of including ad hoc pidgins.)
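To make the difference between (a) and (b) concrete, here is a
minimal Python sketch. The two tables are tiny illustrative subsets
with informal language names, and the function names are made up for
this message; none of it reflects the actual Wikipedia configuration.

```python
# Tiny illustrative subsets; the real code lists are far larger.
ISO_639 = {"en": "English", "fr": "French", "zh": "Chinese",
           "roa": "Romance (group)"}
SIL = {"frn": "French", "rup": "Aromanian", "cfr": "Min Nan"}


def resolve_option_a(label):
    """Option (a): the label's length alone selects the table."""
    label = label.lower()
    if len(label) == 2:
        return ISO_639.get(label)
    if len(label) == 3:
        return SIL.get(label)
    return None


def resolve_option_b(label):
    """Option (b): each half of a hyphenated label is looked up separately."""
    prefix, _, suffix = label.lower().partition("-")
    return ISO_639.get(prefix), SIL.get(suffix)


print(resolve_option_a("cfr"))     # one table consulted, one answer
print(resolve_option_b("en-frn"))  # two tables consulted, two answers
```

Under (a) every label maps to at most one entry, whereas under (b)
nothing in the label itself says which of the two answers wins.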
Using the hyphenated language tags assigned by IANA, such as zh-min-nan,
would conflict with this scheme. For example, if we used zh-yue, it would
be difficult to know what "yue" refers to. Is it an SIL code or an
assigned code?
(option c)
No, it would not. RFC 3066 adopts all ISO 639-1 language
codes. Just look the tag up in RFC 3066 (and, transitively,
in ISO 639-1).
We could use the RFC 3066 codes instead. This is still an option.
However, it wouldn't give us access to a large number of languages
without resorting to awkward constructions such as x-sil-RUP.
(option c, corollary)
And that is exactly what one should do: use constructions
such as x-sil-RUP when nothing more appropriate is
available elsewhere in RFC 3066. That is why such standards
are useful -- they are usually very precise and expressive,
however awkward one may consider them.
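As a rough illustration of that rule, the Python sketch below prefers
a tag that already has an assigned meaning and falls back to a
private-use x-sil- tag otherwise. The set of assigned tags is a small
illustrative subset and choose_tag is a hypothetical helper; the
regular expression encodes only the RFC 3066 syntax (a primary subtag
of 1-8 letters, then subtags of 1-8 letters or digits).

```python
import re

# RFC 3066 syntax: a primary subtag of 1-8 letters, followed by any
# number of hyphen-separated subtags of 1-8 letters or digits.
RFC3066_SYNTAX = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

# Illustrative subset of tags that already have an assigned meaning.
ASSIGNED_TAGS = {"en", "zh", "zh-min-nan"}


def choose_tag(preferred, sil_code):
    """Return the assigned tag if one exists, else a private-use x-sil- tag."""
    if preferred is not None and preferred.lower() in ASSIGNED_TAGS:
        return preferred.lower()
    fallback = "x-sil-" + sil_code.upper()
    if not RFC3066_SYNTAX.fullmatch(fallback):
        raise ValueError("not a well-formed RFC 3066 tag: " + fallback)
    return fallback


print(choose_tag("zh-min-nan", "cfr"))  # assigned tag is used as-is
print(choose_tag(None, "rup"))          # no assigned tag, so private use
```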
If I'm wrong about that, feel free to explain it to me.
I have been arguing for the hardcore option (c), but I
think a reasonable compromise is option (a), which would
involve:
(option a, implementation)
1) changing 'minnan' and 'zh-cfr' to simply 'cfr'
2) changing 'roa-rup' to 'rup'
[3) any other adjustments]
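The renames above amount to a small lookup table. A sketch in Python,
where RENAMES and new_host are hypothetical names used only for
illustration and the table covers only the subdomains mentioned in
this thread:

```python
# Hypothetical rename table for option (a): current subdomain -> new one.
RENAMES = {"minnan": "cfr", "zh-cfr": "cfr", "roa-rup": "rup"}


def new_host(old_host, domain="wikipedia.org"):
    """Map an existing host name to its option (a) equivalent."""
    suffix = "." + domain
    if not old_host.endswith(suffix):
        return old_host  # not one of our hosts; leave it alone
    label = old_host[:-len(suffix)]
    return RENAMES.get(label, label) + suffix


print(new_host("zh-cfr.wikipedia.org"))
print(new_host("roa-rup.wikipedia.org"))
```

Hosts that already follow (a), such as en.wikipedia.org, pass through
unchanged.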
[By the way, many of the SIL-inclusion problems were
argued over when RFC 3066 was being drafted ... and the
result was x-SIL-RUP etc. Sorry to be speaking a bit ex
cathedra, but I resent fighting battles already won....]
If I understand correctly, Shizhao's problem is that holopedia.net, and by
extension minnan.wikipedia.org, is written in a script peculiar to
Taiwan. The writing there is thus not representative of min-nan
generally. So wouldn't it be better to use the RFC 3066 code specific to
Taiwanese, namely zh-min-nan-TW? Or indeed, in keeping with my earlier
point about such language codes being cryptic and unnecessarily lengthy,
why not use ho-lo-oe.wikipedia.org?
To give a glib answer: I do not think Shizhao speaks Minnan;
otherwise he would rather contribute to Holopedia in whatever
way he writes Minnan, or apply for zh-min-nan-CN.