[Gerard Meijssen ([Wiktionary-l] nds is now lowercase ..) writes:]
>>
>> In your mail to the wikitech-l you said that you want to use redirects
>> to indicate alternate spellings. It would be a better idea to treat them
>> like we do with translations, that is, have a proper description of the
>> word and its origin (what orthography or origin) and refer to a selected
>> word for all the rest.
[Pardon me butting in. I have just joined the list and this is my
first posting.]
Apropos of alternative spellings, I am exploring the idea of moving
my 100,000+ entry Japanese-English dictionary(*) into a wiki-like
environment. One aspect of Japanese is that most words have at least
two ways of writing them, and many have more. These are not lemmata; they
are valid orthographical variants. I have stuck very solidly to the
position that no entry can be restricted to a single headword. Multiple
headwords are an intrinsic part of the present (XML) structure, and users
need, and expect, to be able to go directly to the entry using any of
the written forms.
If wiktionary were to end up importing this file (and there are many
things to be considered first), I could not see it working if there
were a separate entry for every orthographical variant. Since wiktionary
seems to have no workable concept of multiple headwords, redirects *may*
be a suitable path to take, especially if they could be generated
automatically at upload time, or at the time of creating a new entry.
Even that would be problematical, in that one of the accepted forms
of a word is its representation in the (quasi-phonetical) hiragana
script, and Japanese has a vast number of homophones. (For example,
I have 19 entries with the pronunciation, and hence kana form, of
koukou.[+])
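As a rough illustration of the redirect idea and the homophone problem described above, here is a minimal Python sketch. It assumes a hypothetical in-memory mapping from entry ids to their headword lists; the function name, the ids and the two sample words are illustrative only and are not taken from the dictionary file or from any MediaWiki API. Forms that belong to exactly one entry could become plain redirects, while forms shared by several entries (such as a common kana reading) would need something more like a disambiguation page.

```python
from collections import defaultdict


def build_redirects(entries):
    """Split written forms into unique redirects and ambiguous forms.

    `entries` maps a (hypothetical) entry id to the list of headwords,
    i.e. all accepted written forms of that entry.
    """
    forms = defaultdict(list)
    for entry_id, headwords in entries.items():
        for form in headwords:
            if entry_id not in forms[form]:
                forms[form].append(entry_id)

    # A form used by exactly one entry can redirect straight to it.
    redirects = {f: ids[0] for f, ids in forms.items() if len(ids) == 1}
    # A form shared by several entries cannot be a simple redirect.
    ambiguous = {f: ids for f, ids in forms.items() if len(ids) > 1}
    return redirects, ambiguous


# Illustrative sample: two homophones sharing one kana form.
entries = {
    "e1": ["高校", "こうこう"],
    "e2": ["孝行", "こうこう"],
}
redirects, ambiguous = build_redirects(entries)
print(redirects)
print(ambiguous)
```

In this sketch the kanji forms each resolve to a single entry, whereas the shared kana form ends up in the ambiguous set, which is exactly the case that plain redirects cannot handle.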
My penny's worth.
Jim Breen
* See:
http://www.csse.monash.edu.au/~jwb/j_jmdict.html
http://www.csse.monash.edu.au/~jwb/edict.html
+ See:
http://www.csse.monash.edu.au/cgi-bin/cgiwrap/jwb/wwwjdic?1MSJkoukou
--
Jim Breen http://www.csse.monash.edu.au/~jwb/
Clayton School of Information Technology, Tel: +61 3 9905 9554
Monash University, VIC 3800, Australia Fax: +61 3 9905 5146
(Monash Provider No. 00008C) ジム・ブリーン@モナシュ大学
First of all: sorry for crossposting to all lists, but I need to reach
as many people as possible.
There are two things I would like to talk about and maybe this mail is
going to be quite long.
At this stage I have taken over part of the organisation of the
translations (or all of it? we'll see ...)
There are some problems involved when it comes to translations for
Wikimania. On the one hand we will have people who know how to edit and
who simply go there and edit an article; on the other hand we will work
with translators who don't know how to do this, so they
receive the article and send it back to me, and then their work needs to be
copied and pasted into the wiki.
Therefore I created a page to collect the names of people who work
for Wikimania and a Yahoo group to ensure that every article is
translated only once.
The translators' list can be found here:
http://meta.wikimedia.org/wiki/Translators_that_are_willing_to_co-operate
It is organised by language combinations. I also added myself, but
really I suppose I won't have much time to translate since I will need
the time for organisational tasks.
The translators' mailing list is on Yahoo: wikimaniatrans(a)yahoogroups.com
I am adding all translators there, since all articles that need
translation are posted there. The first translator who replies to the
list with "I am doing this translation EN-IT" (just to give an example)
will translate that language combination of that particular article, so
all other translators know that this work is being done.
So if you are interested in co-operating on translations, please add
yourself to the list and subscribe to the Yahoo group. If you would rather
not do this yourself, just send me a mail indicating
Language combinations
Name
e-mail
Website (if any)
As to the reporters: to speed up our work it would be great if you
sent me a note with the link as soon as your article is online and ready
for translation. I know that this is not the normal "wiki-way" but we
have to deal with people who are not used to wikis and we don't have
the time to teach them how to work there.
Please drop me a note beforehand so that I know who you are (it would be great).
As to admins: please post the message that we are looking for
translators in your Beer Parlours and ask people who are interested to
contact me by e-mail (either sabine_cretella(a)yahoo.it or
s.cretella(a)wordsandmore.it). I understand German, Italian, English,
French and Spanish, and write German, Italian and English. I approximately
understand some more languages, but this is not a situation in which to
guess what is written, so please bear with me and accept these
limits :-)
For now that's all. The more of us there are, the better we will be and
the less work each of us will have to do. If there's someone who would
like to help with the copy/paste upload, please let me know.
Ciao!!!!
Sabine
*****
Sabine Cretella
Translations
s.cretella(a)wordsandmore.it
skype: sabinecretella
___________________________________
Yahoo! Mail: gratis 1GB per i messaggi e allegati da 10MB
http://mail.yahoo.it
Hoi,
Thanks to Brion the nds wiktionary is now lowercase enabled. This means
that all article names will indicate the proper spelling when a word is
added.
I have a list of 2775 nds words with German translations that I have
been given permission to use, thanks to Germaniennet. I am happy to upload it
to the nds wiktionary. It would be great if you adopted the template
system that is used by many of the other wiktionaries. So have a look at
it and let me know what you think of it.
In your mail to the wikitech-l you said that you want to use redirects
to indicate alternate spellings. It would be a better idea to treat them
like we do with translations, that is, have a proper description of the
word and its origin (what orthography or origin) and refer to a selected
word for all the rest.
Thanks,
GerardM
Hoi,
I had an interesting conversation with Brion. We do not agree on
everything. One of the things we do not agree on is redirects.
In my opinion, Wiktionary should not have redirects. A word is either
spelled correctly and it will have its lemma or it is not and there will
not be a lemma with the incorrect spelling.
In Brion's opinion there are links to lemmas, and as we need to ensure
that these links remain ok, we need redirects to make this possible.
In a Wikipedia context I am 100% with Brion. In a Wiktionary context it
is a different matter. As only correctly spelled words should be in a
Wiktionary, errors should be deleted. Some of our Wiktionaries, for
historical reasons, capitalise their article names. In essence this
means that from a spelling point of view the names of the lemmas are
irrelevant. However, many people assume that the name of the article
indicates that a word is spelled correctly. To remedy this, more and
more wiktionaries are moving away from first character capitalisation
and making it possible to have correctly spelled words as a lemma.
When a wiktionary has made this move away from first character
capitalisation, the interwiki and interproject links within the
Wikimedia projects need to be fixed. After this, the redirects can in my
opinion be removed. I think this is appropriate because users expect
that an application behaves in certain ways. When new content is added
to a non-capitalised Wiktionary, the word foo will not have a redirect
in Foo and consequently it behaves differently from the content
predating the move to non-capitalisation. Also, words like Kinder and
kinder are not related at all. The redirect at Kinder will be replaced
at some stage, breaking the existing redirect and consequently not
providing the continuity that Brion holds dear.
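To make the Foo/foo versus Kinder/kinder distinction concrete, here is a small Python sketch. It assumes a hypothetical mapping of redirect titles to their targets and a hypothetical set of words that are correctly spelled in their own right; neither reflects how MediaWiki stores redirects, and the function name is invented for illustration. It separates redirects that exist only because of the old first-letter capitalisation from those whose title is itself a distinct word.

```python
def classify_case_redirects(redirects, attested_words):
    """Sort case-only redirects into removable ones and conflicts.

    `redirects` maps a redirect title to its target title.
    `attested_words` is the set of titles that are real words in their
    own right (both are hypothetical inputs for this sketch).
    """
    removable, conflicts = [], []
    for source, target in sorted(redirects.items()):
        # Only consider redirects that differ from the target by case alone.
        is_case_only = source != target and source.lower() == target.lower()
        if not is_case_only:
            continue
        if source in attested_words:
            # The capitalised title is a word itself and needs its own entry.
            conflicts.append(source)
        else:
            removable.append(source)
    return removable, conflicts


redirects = {"Foo": "foo", "Kinder": "kinder"}
attested_words = {"foo", "kinder", "Kinder"}
removable, conflicts = classify_case_redirects(redirects, attested_words)
print(removable, conflicts)
```

Under these assumptions "Foo" is a leftover of capitalisation and could be removed once links are fixed, while "Kinder" is a title that will eventually hold an entry of its own.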
For the Ultimate Wiktionary I have documented some of the design
criteria. It can be found here:
http://meta.wikimedia.org/wiki/Ultimate_Wiktionary_decisions_on_its_usage
The Data design can be found here:
http://meta.wikimedia.org/wiki/Ultimate_Wiktionary_data_design
One crucial decision is that only correct spelling is allowed. This
means that all incorrect spelling will be amended or deleted. As
Ultimate Wiktionary is a database, it does not cater for things like
redirects. I urge you to have a look at both the design criteria and the
design itself because this is the time when it is relatively easy to
make changes. Once Erik starts coding the UW database, having finished
Wikidata and the GEMET implementation, the moment has passed us by.
Thanks,
GerardM
Rodolfo Raya wrote:
>On 7/14/05, Gerard Meijssen <gerard.meijssen(a)gmail.com> wrote:
>
>Hi,
>
>>I have started anew on an ERD for the UW.
>
>Have you considered designing an XCS template to use with TBX exports?
>
>You may want to take a look at the default template included in TBX
>specs and remove the fields that you consider irrelevant. Once you
>define the template, it would be easier to identify the requirements
>of the application and write a UML or ERD diagram.
>
>Regards,
>Rodolfo
>
Hoi,
The first thing I need to do is to make sure that we can host the data
in the database. To do that there are several requirements that I have
to allow for:
* The user interface must be in the language indicated by the user.
* Ultimate Wiktionary is to use its own dog food or, if some terminology
required in the UI is not there, it must be possible to add it within
Ultimate Wiktionary
* It must be possible to host all words of all languages in this database.
* We need cooperation of many people to make it all possible so I need
to acknowledge glossaries and thesauri that we are given to host within UW
* Users must be able to select the languages that they want to see.
The budget that we have to create UW is minimal. We will be happy if we
can get it to work and host the data within a limited amount of time. We
have considered TBX. There are, however, TWO crucial things to settle before
we can consider exporting to TBX: the first is IMPORT, the second is an
analysis of what the export should be for. If the Dutch content is
222,000+ words to start off with and everyone starts hitting our servers
because they can, that calls for some optimisation. If the export only
needs to cover the changed content, it is different again. When
the need is associated with the projected reference implementation, it
makes sense to consider how we will ask translators to help with the
content of the UW.
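The difference between a full export and an export of only the changed content can be sketched in a few lines of Python. This is an assumption-laden illustration: the record layout, the `modified` field and the function name are hypothetical and do not describe the UW database or the TBX format.

```python
from datetime import datetime


def changed_since(records, last_export):
    """Return only the records modified after the previous export."""
    return [r for r in records if r["modified"] > last_export]


# Hypothetical records; a real export would read these from the database.
records = [
    {"word": "fiets", "modified": datetime(2005, 7, 1)},
    {"word": "huis", "modified": datetime(2005, 7, 14)},
]
delta = changed_since(records, last_export=datetime(2005, 7, 10))
print([r["word"] for r in delta])
```

A consumer that already holds the previous export would then fetch only the delta instead of all 222,000+ words each time.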
Sabine for instance already adds content to the Italian wiktionary based
on her translations. When we ask for this as part of the reference
implementation for a translation glossary, we want to encourage
translators to work on the content like Sabine does; finding the right
way is crucial. When we start with UW, it may be important to have some
Quality Assurance measures built in. This is also very much intrinsic to
the database design, and that is what is currently being worked on.
Currently some 17 tables make up the ERD; some more are
needed.
Over the last year I have had many conversations about what should be in
an Ultimate Wiktionary. Now I am trying to integrate it all. I have
posted the design to uw-creations(a)googlegroups.com and I have posted
it at http://commons.wikimedia.org/wiki/Image:ERD.jpg
As you will understand, it is a working document; if you have questions or
suggestions, do not hesitate to discuss them with me and with others.
Thanks,
GerardM
Hoi,
I have started anew on an ERD for the UW. It still has few files; they
will grow, but it is already quite complicated. I have looked at the ERD
from Ilse, and found it to be next to useless for a multilanguage
dictionary. This is MY first shot at it. I am still in design mode and
my next exercise will be to include pronunciations. That is more
complicated than it sounds, I am afraid, because you also have
transcriptions to consider and the fact that phonetic script is NOT
standard; Americans transcribe to IPA and SAMPA differently from the
French or the Germans.
Since the last time I looked at this in earnest, I have learned a lot.
This has complicated my perception and it
should make for a data design that is better at solving the issues as I
know them. I have looked at several designs and rejected a few; there
are some designs that I want to revisit.
I also sent this mail to uw-creations, a Google group that we set up for
building a reference implementation of a Computer Aided
Translation (CAT) tool. This design is as relevant to them as it is to
Wiktionary-l. Everyone can read the content of that list, but only
people who have been invited can post there. (drop me a line)
As always, I am happy for your comments. In the spirit of Open Source I
will publish often.
Thanks,
GerardM
Hello,
I'm planning to set up a dictionary of terms related to the field of plant
genetic resources, using MediaWiki. I want it to have all definitions in 5
different languages and there is a possibility of adding more.
I was wondering if the usual/recommended case of setting up a site like this
would be to follow the model of Wiktionary and have separate installations
for each language of the dictionary, using interwiki links between them (I
would have all installations on the same server and use interwiki links,
which I think amounts to the same structure of links as what we see in
Wiktionary).
I'm not sure if this is the best idea for my case, since for Wiktionary
this kind of installation may fit the situation with regard to maintenance/curation and
server load, whereas in my case there would be less server load and centralized
maintenance/curation would be preferred.
So I suppose my main question is, are there more advantages to having
everything on one MediaWiki installation or on separate installations? Also,
what would the advantages/disadvantages be for each situation?
Any of your suggestions would be very much appreciated.
Thanks,
Andrew.
One of the effects of the recent page capitalization issue has been that
the notice about this has overwritten direct access from en:wiktionary
to the foundation vote pages.
I was only able to find it by rooting around the special pages. It would be
helpful if the appropriate links were re-established.
Ec
Yesterday we were talking about tables and structures for Ultimate
Wiktionary.
Basically we repeated everything we have said time and time again. Gerard
wrote some notes on this on his blog.
http://ultimategerardm.blogspot.com/
The design phase for the tables of UW is apparently starting. Please
give us your ideas so that we will have the best UW possible :-)
Ciao, Sabine