We're tracking source/destination pairs generated by the ContentTranslation tool, right? Could someone point me to that dataset? (I'm playing around with some machine translation stuff to see if I can prototype a suggester tool that would translate edits on wiki A to corresponding edits on wiki B.)
--scott
PS. There's some cool work being done on "zero-shot translation", a.k.a. bootstrapping translation tools for small languages by pre-training them on a related language (or even an unrelated language). Apparently that works! (Cf. https://arxiv.org/pdf/1611.04558.pdf) It can greatly reduce the amount of data required to build a translation model for the small language.
Is there a "small wiki" that's been wanting to use ContentTranslation and would be a good candidate for experimentation?