Delirium provides a good reference - yes, something we can agree on! :)
However, I'd take issue with the convertor being considered
"impossible by many experts." If a human being with expert
knowledge can map the text across, then the task is certainly not in
the domain of the "impossible."
In fact, ZH Wikipedia could be a leader in the development of such a
system, and create a modular open source solution for others to use
too. The "power of many" can certainly be seen in this task, which has
a variety of special cases for mapping based on context. The mapping
system itself could be run like a wiki, so people can contribute and
alter the rules as needed, as colloquial grammar and usage change
with time.
The challenge now is getting enough critical mass and developers for ZH.
-Andrew Lih (User:Fuzheado)
On Tue, 22 Jun 2004 10:50:04 -0500, Delirium <delirium(a)hackish.org> wrote:
Roozbeh Pournader wrote:
I don't want to get into the debate, but just FYI, such a convertor is
considered impossible by many experts. By impossible, I mean
impossible at the level of perfect German to English to German
machine translation software. Refer to the Unicode mailing list and its
archives for more details.
Just to provide some more concrete points of reference:
Jack Halpern and Jouni Kerman. "The Pitfalls and Complexities of Chinese
to Chinese Conversion". Proceedings of the 14th International Unicode
Conference, Cambridge, Massachusetts, USA, March 1999.
[http://www.basistech.com/papers/chinese/c2c.html]
The biggest issue seems to be that certain simplified characters map to
one of multiple traditional characters, depending on context. Thus a
translator has to know the context, which requires solving some
fairly daunting natural-language processing problems. Conversion from
traditional to simplified seems to be much easier, as it typically
collapses the total number of characters used.
-Mark
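
To make the one-to-many problem concrete, here is a minimal Python sketch. It is
only an illustration under stated assumptions: the rule tables are tiny and
hypothetical, the function names are made up for this example, and greedy
longest-match lookup is a stand-in technique, not the method from the paper
above or from any existing converter. It shows the simplified character 发
becoming 發 in one word and 髮 in another, while the reverse direction needs
nothing more than a character-by-character table.

```python
# Hypothetical, deliberately tiny rule tables (assumptions for illustration).
# Word-level rules carry the context: 发 becomes 髮 in 头发 (hair)
# but 發 in 发展 (develop); 后 stays 后 in 皇后 (empress) but
# becomes 後 in 以后 (afterwards).
S2T_WORDS = {
    "头发": "頭髮",
    "发展": "發展",
    "皇后": "皇后",
    "以后": "以後",
}

# Character-level fallback, used only when no word rule matches.
S2T_CHARS = {"发": "發", "头": "頭", "后": "後"}

# Traditional to simplified: several traditional characters collapse
# onto one simplified character, so a plain character map suffices here.
T2S_CHARS = {"發": "发", "髮": "发", "頭": "头", "後": "后"}


def to_traditional(text):
    """Convert using greedy longest-match word rules, then a character fallback."""
    max_len = max(len(word) for word in S2T_WORDS)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate word first, down to two characters.
        for n in range(max_len, 1, -1):
            chunk = text[i:i + n]
            if chunk in S2T_WORDS:
                out.append(S2T_WORDS[chunk])
                i += n
                break
        else:
            # No word rule matched: fall back to the single-character table.
            out.append(S2T_CHARS.get(text[i], text[i]))
            i += 1
    return "".join(out)


def to_simplified(text):
    """Convert character by character; no context is needed in this direction."""
    return "".join(T2S_CHARS.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(to_traditional("头发"), to_traditional("发展"))
    print(to_simplified("頭髮發展"))
```

Greedy matching like this will still go wrong wherever word boundaries are
ambiguous, which is exactly where the hard natural-language processing comes in.
On the other hand, the word table is the kind of rule set that contributors
could extend one special case at a time.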