I'm trying to render an image which uses characters from all of the
languages supported by WP. Is there a single font deployed on production
servers that includes all scripts?
The Autonym font includes characters for all the languages supported by
MediaWiki, but only a small subset of each script (the glyphs needed to
render the language names themselves):
https://www.mediawiki.org/wiki/Universal_Language_Selector/AutonymFont
Ryan Kaldari
On Wed, Jul 30, 2014 at 7:48 AM, C. Scott Ananian <cananian(a)wikimedia.org>
wrote:
In general, "one font to rule them all" is highly
discouraged/impractical as a means to achieve reasonable results in a
variety of world languages. Indic fonts, for example, typically
contain complex shaping engines in bytecode -- it's just not practical
to try to write one engine for everything. All of the "one font with
wide coverage" attempts that I have seen look okay for Latin
languages, but fail for the rest of the world.
<rant>...which is typically the opinion inadvertently expressed by
these "characters from every language" projects anyway. Without real
knowledge of the rest of the world's languages and scripts, we get
something that shows that the creator valued the rest of the world
only for "looking exotic", and was not interested in true
understanding.</rant>
Most modern font systems have a mechanism to merge multiple fonts
under one virtual name as needed in order to get good coverage. So
you don't need to find a single font to rule them all.
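
For illustration, the merging mechanism described above can be
sketched roughly as follows. This is only a minimal sketch, not how
any particular font system implements it: the font names and the
coverage ranges in FALLBACK_CHAIN are made up, and a real system would
read coverage from each font's character map instead of a hand-written
table.

```python
from itertools import groupby

# Hypothetical fallback chain: (font name, list of (first, last) codepoint ranges).
# Both the names and the ranges are illustrative assumptions, not real font data.
FALLBACK_CHAIN = [
    ("ExampleLatin", [(0x0020, 0x024F)]),
    ("ExampleDevanagari", [(0x0900, 0x097F)]),
    ("ExampleCJK", [(0x4E00, 0x9FFF)]),
]


def font_for_char(ch):
    """Return the first font in the chain that covers ch, or None."""
    cp = ord(ch)
    for name, ranges in FALLBACK_CHAIN:
        if any(first <= cp <= last for first, last in ranges):
            return name
    return None


def split_into_runs(text):
    """Group consecutive characters that resolve to the same font."""
    return [
        (font, "".join(chars))
        for font, chars in groupby(text, key=font_for_char)
    ]


if __name__ == "__main__":
    for font, run in split_into_runs("abc नमस्ते 中文"):
        print(font, repr(run))
```

The caller asks for one "virtual" font and gets back runs, each
assigned to whichever real font covers it. Note that this selects
purely by codepoint.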
--scott
ps. "Font synthesis" systems actually have a big problem in that parts
of the Unicode character space are shared by different languages with
different rules for shaping and ligatures, etc. So you really need to
explicitly annotate the language and then choose a font specific to
that *language*, not rely simply on codepoint. (Unfortunately much of
the "foreign language" content in Wikipedia (i.e. short texts which are
not in the main language of the wiki) is not explicitly annotated with
language information.)
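
A rough sketch of what "choose by language, not by codepoint" could
look like. Everything named here is an assumption for illustration:
the FONT_BY_LANG table, the font names, and the pick_font helper are
hypothetical. The pairs were chosen because Arabic and Urdu share
Arabic-script codepoints but are conventionally set in different
styles (Naskh vs. Nastaliq), and Japanese and Chinese share Han
codepoints but expect different glyph forms.

```python
# Hypothetical mapping from language tag to font; names are placeholders.
FONT_BY_LANG = {
    "ar": "ExampleNaskh",
    "ur": "ExampleNastaliq",
    "ja": "ExampleMincho",
    "zh-hans": "ExampleSong",
}
DEFAULT_FONT = "ExampleFallback"


def pick_font(lang_tag):
    """Look up the full tag, then progressively shorter prefixes of it."""
    parts = lang_tag.lower().split("-") if lang_tag else []
    while parts:
        font = FONT_BY_LANG.get("-".join(parts))
        if font is not None:
            return font
        parts.pop()  # drop the most specific subtag and retry
    return DEFAULT_FONT


if __name__ == "__main__":
    for tag in ("ur", "ja-JP", "zh-Hans-CN", None):
        print(tag, "->", pick_font(tag))
```

When the span carries no language annotation at all, the lookup has
nothing to go on and lands on the default, which is exactly the
situation described above.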
pps. for those actually interested in getting the details of world
writing systems correct, I could use some help with the new OCG PDF
rendering backend, which just went live in production yesterday. It
uses XeLaTeX, which actually does pay careful attention to Indic
shaping and ligatures, etc, but it is not a "modern system" as
described above in terms of synthesizing coverage from multiple fonts.
Patches would be helpful to make better guesses about the native
language of "foreign language" spans, which would then ensure an
appropriate font was used.
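
To give an idea of the kind of guessing meant here, a minimal sketch
follows. It is not the OCG code and is not written in the backend's
own language; the guess_lang helper and the SCRIPT_TO_LANG_GUESS table
are hypothetical. It derives a coarse script label from the first word
of each character's Unicode name, which is a crude stand-in for a
proper script property, and the script-to-language mapping is
deliberately naive (Devanagari is not always Hindi, Han is not always
Chinese).

```python
import unicodedata
from collections import Counter

# Naive, hypothetical defaults; each script here is shared by several languages.
SCRIPT_TO_LANG_GUESS = {
    "HIRAGANA": "ja",
    "KATAKANA": "ja",
    "HANGUL": "ko",
    "DEVANAGARI": "hi",
    "ARABIC": "ar",
    "CJK": "zh",
}


def guess_lang(span, wiki_lang="en"):
    """Guess a language tag for an unannotated span of text."""
    counts = Counter()
    for ch in span:
        if not ch.isalpha():
            continue  # skip digits, punctuation, spaces, combining marks
        name = unicodedata.name(ch, "")
        if name:
            # First word of the character name, e.g. "DEVANAGARI", "CJK".
            counts[name.split(" ", 1)[0]] += 1
    # Any kana suggests Japanese even if Han characters dominate the span.
    if counts["HIRAGANA"] or counts["KATAKANA"]:
        return "ja"
    if not counts:
        return wiki_lang
    script = counts.most_common(1)[0][0]
    return SCRIPT_TO_LANG_GUESS.get(script, wiki_lang)


if __name__ == "__main__":
    for sample in ("日本語のテキスト", "中文", "한국어", "नमस्ते", "hello"):
        print(sample, "->", guess_lang(sample))
```

Anything the table does not recognise falls back to the wiki's main
language, so a wrong guess is no worse than having no annotation.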
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l