Hi Gerard,
Signed languages are completely independent systems. They are separate
languages from spoken languages, with their own grammar, syntax, and
the like. Unfortunately, there is no universally agreed-upon method
for transcribing signed languages.
There are a few possibilities here.
1) Choose a particular transcription system. Top of the list would be
Stokoe notation and Sutton SignWriting; HamNoSys is also possible, but
it is mostly used by linguists, while the other two are used more
widely by people who use a signed language as their everyday language.
2) Use multimedia. We can upload videos of people signing a particular
word. Note that some signed languages also have conjugations and
inflections. However, this leaves the problem of headwords -- how do
you look up a word in a signed language? That leads to the third
option:
3) Introduce our own notation system. This is impractical and unlikely
to work well. I suggest that we instead adopt HamNoSys for lookup
purposes. Although it is not encoded in Unicode, we can try an ASCII
implementation.
Regarding "dialects" of Chinese and Arabic, that is very simple: treat
them as separate languages. While people most often write in "Standard
Arabic" or "Standard Chinese", it is also possible to write in the
local vernacular. This tends to be done more with Arabic, but is
possible with either. With Chinese, you only see it often with
Cantonese; other varieties are occasionally written, but you are more
likely to find a Bible translation in them than a newspaper.
I hope very much that you will not restrict languages to those which
appear on the ISO 639-3 list. It has many shortcomings and is very,
very, very disappointing -- it would not allow for separate entries
for Yavapai, Hualapai, and Havasupai (it has only one code for them
all), even though they are very much different languages, and it by no
means includes all the languages of the world. It also distinguishes
between Moroccan, Tunisian, and Algerian Arabic, when they're really
nearly identical.
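
To make the point about codes concrete, here is a minimal sketch in
Python. It is not a proposal for the actual schema. The names
LanguageEntry, local_id, and iso_code are hypothetical, and so is the
"example-uncoded" placeholder; the only real-world datum is "yuf", the
single ISO 639-3 code that covers Havasupai, Hualapai, and Yavapai. The
idea is simply that every language gets its own key, and the ISO code
is an optional attribute that several entries may share or lack.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LanguageEntry:
    local_id: str                   # hypothetical project-internal key
    name: str
    iso_code: Optional[str] = None  # optional: may be shared or missing


LANGUAGES = [
    # Three separate entries that happen to share one ISO 639-3 code.
    LanguageEntry("yavapai", "Yavapai", "yuf"),
    LanguageEntry("hualapai", "Hualapai", "yuf"),
    LanguageEntry("havasupai", "Havasupai", "yuf"),
    # Placeholder for a language that has no ISO code at all.
    LanguageEntry("example-uncoded", "Some uncoded language"),
]


def by_iso_code(code):
    """Return every entry that carries the given ISO code."""
    return [lang for lang in LANGUAGES if lang.iso_code == code]


def lookup(local_id):
    """Return the single entry with this local key, or None."""
    for lang in LANGUAGES:
        if lang.local_id == local_id:
            return lang
    return None
```

With this layout, looking up "hualapai" gives exactly one entry, while
asking for everything tagged "yuf" gives all three, so the ISO list can
still be used without being the thing that decides what counts as a
language.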
Mark
On 20/07/05, Gerard Meijssen <gerard.meijssen(a)gmail.com> wrote:
Timwi wrote:
Gerard Meijssen wrote:
I would welcome your comments about the ERD that I posted here:
http://commons.wikimedia.org/wiki/Image:ERD.jpg
Looks interesting, but is extremely bare. It would do well with a bit
of documentation. For much of it, the purpose isn't entirely clear.
I'm particularly confused as to why "Language", "Word" and "Meaning"
are each duplicated.
Hoi,
There is some documentation here:
http://meta.wikimedia.org/wiki/Ultimate_Wiktionary_data_design
and here:
http://meta.wikimedia.org/wiki/Ultimate_Wiktionary_decisions_on_its_usage
The duplication reflects that there is at least one table that has two
relations to the same table. Language refers to itself for dialects.
Word refers through Conju/Decli (conjugation or declension) to a
headword and its derived words. Meaning is related to itself through
"Relations"; this is to allow for thesaurus-like structures.
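
The three self-references described above can be sketched roughly as
follows. This is only an illustration in Python under assumed names:
the classes, the fields (parent, headword, relations), and the sample
words are all hypothetical and are not taken from the ERD itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Language:
    name: str
    parent: Optional["Language"] = None  # set when this is a dialect


@dataclass
class Word:
    spelling: str
    language: Language
    headword: Optional["Word"] = None  # set for a conjugated/declined form


@dataclass
class Meaning:
    definition: str
    # Each item is (relation type, other meaning), e.g. ("synonym", m).
    relations: List[Tuple[str, "Meaning"]] = field(default_factory=list)


def root_language(lang):
    """Follow parent links from a dialect up to the top-level language."""
    while lang.parent is not None:
        lang = lang.parent
    return lang


# Illustrative data only.
english = Language("English")
us_english = Language("American English", parent=english)

walk = Word("walk", english)
walked = Word("walked", english, headword=walk)

large = Meaning("of great size")
big = Meaning("of considerable size")
big.relations.append(("synonym", large))
```

Each class points back at its own type once, which is why the diagram
would show each of those tables twice.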
One reason it is not documented as much as I would like is that I am
still working on the structure. At the moment I am thinking hard about
how to include signed languages and the spoken dialects of the Chinese
and Arabic written languages.
Thanks,
GerardM
--
IF YOU CAN READ THIS, YOU HAVE TOO MUCH EDUCATION
HOW MUCH WOOD WOULD A WOODCHUCK CHUCK IF A WOODCHUCK COULD CHUCK WOOD
IS THAT A SCROLL IN YOUR TOGA, OR ARE YOU JUST HAPPY TO SEE ME