Hi Gerard,
Thank you for your answer.
The German situation is a bit difficult. In actual fact there are only
two orthographies because two Bundesländer did not pass as law that the
new spelling would apply there as well. The consequence is that both old
spelling and new spelling are valid. In a typical situation, the words
that have been changed would get dated and be outdated. From a practical
point of view I would only have the changed words and the new words
included and I would treat them as if these two Bundesländer had voted
in favour. For lookup purposes the difference is a SELECT statement in
the query statement.
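If I read this correctly, the lookup could be sketched as follows. This
is only a minimal sketch in Python with sqlite3; the table "entry", its
columns and the orthography labels "DE-1901" and "DE-1996" are my own
assumptions for illustration, not the actual schema.

```python
import sqlite3

# Hypothetical schema: one row per expression and orthography.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (title TEXT, lang TEXT, orthography TEXT)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?)",
    [
        ("daß", "DE", "DE-1901"),          # old spelling
        ("dass", "DE", "DE-1996"),         # reformed spelling
        ("Schiffahrt", "DE", "DE-1901"),   # old spelling
        ("Schifffahrt", "DE", "DE-1996"),  # reformed spelling
    ],
)


def lookup(orthographies):
    """Return the distinct titles registered under any given orthography."""
    placeholders = ",".join("?" for _ in orthographies)
    rows = conn.execute(
        "SELECT DISTINCT title FROM entry "
        f"WHERE lang = 'DE' AND orthography IN ({placeholders})",
        list(orthographies),
    )
    return {title for (title,) in rows}


# The only difference between the two lookups is the SELECT condition.
new_only = lookup(["DE-1996"])
old_and_new = lookup(["DE-1901", "DE-1996"])
```

Under these assumptions the old spellings stay in the table and are
simply filtered out when only the new orthography is requested.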
So you do not want to include the old spelling? From what I understood
for Low Saxon you also wanted to include historic spellings. But I may
have misunderstood that.
The argument why all words have to be explicitly identified as belonging
to an orthography is because it allows us to do other things than just
producing lexicological information from the Internet. What in your
perception is a "multiplication of entries" is in actual fact no such
thing; an expression is registered only once for each language, dialect
or orthography.
So number of entries = (number of languages) x (number of dialects) x
(number of orthographies)?
What are you planning to do with American English vs. British English?
You would have two entries:
1)
title=color
lang=EN
dialect=EN_US
orthography=USA-official
2)
title=colour
lang=EN
dialect=EN_GB
orthography=GB official
That is fine. But what about "bus"? Would you have two entries?
1)
title=bus
lang=EN
dialect=EN_US
orthography=USA-official
2)
title=bus
lang=EN
dialect=EN_GB
orthography=GB official
That (to my understanding) would double the entries for English, wouldn't it?
And the translation of de:Bus would list en_US: bus, en_GB: bus?
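To make my concern concrete, here is a small Python sketch of the model
as I understand it. The Entry record and its field names are my own
assumption, taken from the examples above, and not your actual design.

```python
from collections import namedtuple

# Hypothetical record: one entry per title, language, dialect and orthography.
Entry = namedtuple("Entry", "title lang dialect orthography")

entries = [
    Entry("color", "EN", "EN_US", "USA-official"),
    Entry("colour", "EN", "EN_GB", "GB official"),
    Entry("bus", "EN", "EN_US", "USA-official"),
    Entry("bus", "EN", "EN_GB", "GB official"),
]

# Four entries are stored, but only three different spellings exist.
entry_count = len(entries)
distinct_spellings = {e.title for e in entries}

# A translation list for de:Bus would then name "bus" once per dialect.
translations_of_de_bus = [
    f"{e.dialect}: {e.title}" for e in entries if e.title == "bus"
]
```

In this sketch every word that is spelled the same in both dialects is
stored twice, which is the doubling I am asking about.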
Kind regards,
Heiko Evermann