Hoi,
For your information, this is an "as is" implementation of the GEMET
thesaurus. It does not reflect how things will be done in WiktionaryZ.
It demonstrates that we CAN host a thesaurus such as GEMET. All the
problems associated with this data can be found in this
implementation.
On 1/1/06, Heiko Evermann <heiko.evermann(a)gmx.de> wrote:
Hi Gerard,
I am extremely happy that I can inform you on Boxing Day that a tired
Erik has produced the first tangible result of the Wikidata / Ultimate
Wiktionary project. It shows that we want great content in many
languages, that we want to include thesaurus information and that we
are happy and grateful to include content like the GEMET thesaurus.
First of all: congratulations on the working prototype. I think that UW is a
very important project and it is good to see that it is making progress.
Thanks
Our goal with Ultimate Wiktionary is to provide an even more complex
application that will make this data collaboratively editable, to add
dynamic user-based views, APIs, and crucial features such as
inflections, etymologies, complex relations and attributes, and much
more. This will be a huge challenge. Fortunately, more funding seems to
be on the horizon, allowing us to put more developers on the job.
I have run some experiments, and this led to some thoughts and questions. I
hope that you do not mind.
I have tried http://www.epov.org/wd-gemet/index.php/flower and have seen two
entries: en and en-US. Both seem to have independent lists of translations,
as en lists en-US as a translation but en-US does not list en as a translation.
How will this be handled? What is your way to make sure that the definitions
and translations remain consistent? In the case of flower there should be no
distinction between en and en-US whatsoever. I have asked this before, but so
far I have not received a response that really solved the matter.
First of all, en-US is NOT a different language, even though it says so at
this moment. en-US is a different locale.
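As a rough illustration of the language/locale distinction (a sketch only,
not how the prototype stores it): a tag such as en-US can be split into a
language part and an optional region part, so that en and en-US are
recognized as the same language. The helper names split_locale and
same_language are hypothetical.

```python
def split_locale(tag):
    """Split a tag like 'en-US' into (language, region).

    Hypothetical helper: region is None when the tag carries no region.
    """
    language, _, region = tag.partition("-")
    return language.lower(), (region.upper() or None)


def same_language(tag_a, tag_b):
    """True when two locale tags share the same language part."""
    return split_locale(tag_a)[0] == split_locale(tag_b)[0]
```

With this, same_language("en", "en-US") holds, while same_language("en", "de")
does not.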
I have also had a look at the entries for "color" and "colour". The
interesting thing here is that one (en) has a definition and one (en-US)
doesn't.
See above
And when I navigate to the different translations (e.g. German: Farbe), I get
the same list of definitions in several languages. However, when we translate
Farbe back into English, we would have to make the distinction between two
different meanings: color (hue) and paint (for the wall). As all the
different translations seem to form one ring, I wonder how the meaning
would be split in the German entry, and how one would assign the translations
into pl, sl, gb to these two meanings.
For this I have to refer you to the data design. Farbe will be
associated with two DefinedMeanings. These are not both defined at
this moment. Translations are associated with a specific
DefinedMeaning (again, see the data design).
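To make the idea concrete, here is a minimal sketch of how one word can
belong to two DefinedMeanings, each with its own translations. This is not
the actual data design: the class layout, the helper names meanings_of and
translate, and the definition texts are all assumptions made for
illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DefinedMeaning:
    """One meaning, with the expressions for it in each language.

    Hypothetical structure, not the real WiktionaryZ schema.
    """
    definition: str
    # language code -> list of expressions carrying this meaning
    translations: dict = field(default_factory=dict)

    def add(self, lang, word):
        self.translations.setdefault(lang, []).append(word)


def meanings_of(word, lang, meanings):
    """Return every DefinedMeaning that lists `word` in language `lang`."""
    return [m for m in meanings if word in m.translations.get(lang, [])]


def translate(word, source, target, meanings):
    """Map each meaning of `word` to its expressions in the target language."""
    return {
        m.definition: m.translations.get(target, [])
        for m in meanings_of(word, source, meanings)
    }


# Illustrative data only; the definition texts are made up.
hue = DefinedMeaning("visual property (hue)")
hue.add("de", "Farbe")
hue.add("en", "colour")
hue.add("en", "color")

paint = DefinedMeaning("substance applied to a surface (paint)")
paint.add("de", "Farbe")
paint.add("en", "paint")

ALL_MEANINGS = [hue, paint]
```

Here translate("Farbe", "de", "en", ALL_MEANINGS) returns one list of English
expressions per meaning, so the two senses of Farbe never share a single
ring of translations.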
Kind regards,
Heiko Evermann
Happy New Year,
GerardM