[Foundation-l] Push translation

Lars Aronsson lars at aronsson.se
Thu Aug 5 17:39:58 UTC 2010


On 08/05/2010 03:12 PM, Michael Galvez wrote:
> Sorry for coming into this discussion a bit late.  I'm one of the members of
> Google's translation team, and I wanted to make myself available for
> feedback/questions.

This is an unusual and most welcome step for Google. When I first
learned about GTTK in June 2009, I used it to translate a handful of
articles from English to Swedish. I'm glad that it's now also possible
to translate into English, but some of the errors are still there.

It's a great tool, and should be used more. We have a common
interest in improving it. But for this, wikipedians need feedback.
Which language pairs are most active? What words or phrases
does GTTK find problematic, and can we somehow improve that?
Google could benefit so much from collaborating with wikipedians.
Ultimately, Google could share some translation dictionaries, so
we could include them in Wiktionary, the free dictionary.

Users of Gmail or Google Apps want their privacy, but users who
translate Wikipedia articles are already sharing their results, so
Google could help us to find each other and make us collaborate.
Translations that start from a Wikipedia article could by default
be put in a shared pool where other wikipedians can find them.

Now to some details:

I need a way to mark in the original text that a phrase is a quote,
book title or proper noun that shouldn't be translated, but copied
literally. And in the statistics, those words should not be counted
as untranslated, nor should they block me from publishing the result.
Optimally, GTTK would learn over time where such literal phrases occur,
e.g. text in italics under the ==Bibliography== section.
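
A minimal sketch of that last idea, in Python, assuming plain wikitext as
input. The function name literal_phrases and the choice of section are
hypothetical, and nothing here reflects how GTTK actually works. It only
collects italic phrases (''...'') found under one level-2 heading, so that
they could be proposed as "copy literally" candidates:

```python
import re

# Level-2 heading such as "==Bibliography==" (deeper levels like "===X===" do not match).
HEADING = re.compile(r"^==([^=].*?)==\s*$")

# Italic wikitext: exactly two apostrophes on each side, so '''bold''' is skipped.
ITALIC = re.compile(r"(?<!')''(?!')(.+?)(?<!')''(?!')")


def literal_phrases(wikitext, section="Bibliography"):
    """Return italic phrases found under the given level-2 section.

    Hypothetical helper: the phrases are only candidates for being
    copied literally instead of translated.
    """
    phrases = []
    in_section = False
    for line in wikitext.splitlines():
        heading = HEADING.match(line)
        if heading:
            # Entering or leaving the section of interest.
            in_section = heading.group(1).strip() == section
            continue
        if in_section:
            phrases.extend(ITALIC.findall(line))
    return phrases
```

Italics outside the chosen section are ignored on purpose, since a phrase
in italics in running text is often emphasis rather than a title.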

English ==References== corresponds to Swedish ==Källor==,
even though the two words are not direct translations. GTTK was
pretty quick to pick this up. However, the different styles we use
for the opening paragraph of biographical articles (the English
Wikipedia puts parentheses around the birth and death dates, but
some other languages do not) are something GTTK has not yet learned.

Category names should not be translated, but GTTK should follow the
interwiki links for categories. If none exists, it could perhaps
suggest a parent category.

Even for articles that already exist in the target language, we often
need to translate another section. For example, the Swedish Wikipedia
might have an article about Afghanistan with a good section about its
geography, but the history section needs improvement, and could
be translated from another language. The work-around is to begin
a translation of the whole article, but only translate the relevant
part and then cut-and-paste into the target without submitting
through GTTK. Perhaps GTTK could bring up both articles side by
side and suggest which sections are in most dire need of improvement?
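
A rough sketch of such a comparison, in Python. It assumes both articles
are available as wikitext and that a mapping between corresponding
section headings is supplied by hand (heading_map is hypothetical, as are
the function names). Character count is used as a crude stand-in for
"needs improvement", and this is not a description of anything GTTK does:

```python
import re

# Level-2 heading such as "==History==".
HEADING = re.compile(r"^==([^=].*?)==\s*$")


def section_lengths(wikitext):
    """Map each level-2 heading to the character count of its body."""
    lengths = {}
    current = None
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(1).strip()
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line.strip())
    return lengths


def sections_to_improve(source_text, target_text, heading_map):
    """List source headings whose target counterpart is shorter or missing.

    heading_map is an assumed, manually supplied dict from source heading
    to target heading. Results are ordered by the largest gap first.
    """
    source = section_lengths(source_text)
    target = section_lengths(target_text)
    gaps = []
    for heading, source_len in source.items():
        target_len = target.get(heading_map.get(heading), 0)
        if source_len > target_len:
            gaps.append((source_len - target_len, heading))
    gaps.sort(reverse=True)
    return [heading for _, heading in gaps]
```

For the Afghanistan example, a Swedish article with a long geography
section and a short history section would yield only the history section
as a candidate for translation from English.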


-- 
   Lars Aronsson (lars at aronsson.se)
   Aronsson Datateknik - http://aronsson.se




