I'll just presume the reason nobody responded to me is that I seemed
to be all talk and no code.
So I have posted everything to my personal web site:
<http://www.nohat.net/OutputPage.php.diff>
<http://www.nohat.net/wikistandard.css.diff>
<http://www.nohat.net/Makefile>
<http://www.nohat.net/sampa2unicode.lex>
You have to put the "sampa2unicode" executable built by the Makefile
into the same directory as OutputPage.php. Try it: you can put stuff
like <IPA>plEZ@`</IPA> onto pages and get nice IPA output.
If there are no objections and someone gives SF user Nohat CVS access, I
can commit these changes.
- David [[User:Nohat]]
David Friedland wrote:
The last discussions on the lists about how to represent pronunciations
on Wikipedia didn't end with any definitive conclusions.
One of the proposed suggestions was to provide a way to input IPA in an
easy-to-use format and then have it automagically get converted into
Unicode IPA. As a first step I wrote a Lex analyzer to convert X-SAMPA
to Unicode IPA (which results in C code that I compiled to an
executable). This will enable editors to enter pronunciations in the
ugly-but-easy-to-type X-SAMPA format and have them appear in IPA using
the Unicode IPA extensions.
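
For illustration only, the longest-match idea behind such a converter can be sketched in Python. This is not the posted sampa2unicode.lex: the table below is a small hypothetical subset of X-SAMPA, and the names XSAMPA_TO_IPA and xsampa_to_ipa are made up for this sketch. Lex prefers the longest matching pattern, so the sketch tries two-character symbols such as @` before single characters.

```python
# Hypothetical, partial X-SAMPA -> IPA table (illustration only; a real
# converter would need the full X-SAMPA inventory).
XSAMPA_TO_IPA = {
    "@`": "\u025a",   # r-coloured schwa
    "r\\": "\u0279",  # alveolar approximant
    "@": "\u0259",    # schwa
    "E": "\u025b",    # open-mid front unrounded vowel
    "Z": "\u0292",    # voiced postalveolar fricative
    "S": "\u0283",    # voiceless postalveolar fricative
    "N": "\u014b",    # velar nasal
    "I": "\u026a",    # near-close front unrounded vowel
    '"': "\u02c8",    # primary stress
    ":": "\u02d0",    # length mark
}

MAX_LEN = max(len(key) for key in XSAMPA_TO_IPA)


def xsampa_to_ipa(text):
    """Convert X-SAMPA to IPA, trying the longest symbol first."""
    out = []
    i = 0
    while i < len(text):
        for length in range(MAX_LEN, 0, -1):
            chunk = text[i:i + length]
            if chunk in XSAMPA_TO_IPA:
                out.append(XSAMPA_TO_IPA[chunk])
                i += len(chunk)
                break
        else:
            # No table entry: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(xsampa_to_ipa("plEZ@`"))
```

Characters with no entry in the table (here "p" and "l") pass through unchanged, which suits X-SAMPA since many of its lowercase letters are already the IPA symbol.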
I also dove into the Wikipedia code and patched OutputPage.php to
support <IPA> </IPA> tags that surround SAMPA and output it as <SPAN
class="IPA"> </SPAN>, with the Unicode IPA HTML entities inside the SPAN
tag. Finally, I patched style/wikistandard.css to have a .IPA style that
explicitly sets the font-family to a list of fonts that are known to
contain the IPA Unicode extensions. This is necessary because some
browsers (namely Windows IE) don't display IPA Unicode characters, even
when a font containing them is installed, unless the currently active
font has those characters.
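
As a rough picture of the tag handling described above, here is a Python sketch (the actual patch is to OutputPage.php and calls the sampa2unicode executable, so none of this is the posted code). The names render_ipa_tags and demo_convert, and the four-entry table, are hypothetical stand-ins; the sketch only shows the shape of the transformation: find <IPA>...</IPA>, convert the contents, and emit a SPAN with numeric character references.

```python
import html
import re

# Hypothetical stand-in for the external converter: a four-symbol table.
_DEMO_TABLE = {"@`": "\u025a", "@": "\u0259", "E": "\u025b", "Z": "\u0292"}


def demo_convert(sampa):
    """Toy converter: two-character symbols are tried before one-character."""
    out, i = [], 0
    while i < len(sampa):
        for size in (2, 1):
            piece = sampa[i:i + size]
            if piece in _DEMO_TABLE:
                out.append(_DEMO_TABLE[piece])
                break
        else:
            piece = sampa[i]
            out.append(piece)
        i += len(piece)
    return "".join(out)


def to_entities(text):
    """Escape HTML metacharacters, then turn non-ASCII into &#NNN; references."""
    return html.escape(text).encode("ascii", "xmlcharrefreplace").decode("ascii")


IPA_TAG = re.compile(r"<IPA>(.*?)</IPA>", re.IGNORECASE | re.DOTALL)


def render_ipa_tags(wikitext, convert=demo_convert):
    """Replace each <IPA>...</IPA> with <SPAN class="IPA">...</SPAN>."""
    def repl(match):
        return '<SPAN class="IPA">' + to_entities(convert(match.group(1))) + "</SPAN>"
    return IPA_TAG.sub(repl, wikitext)


print(render_ipa_tags("pleasure: <IPA>plEZ@`</IPA>"))
```

Escaping before emitting the SPAN matters in this sketch because X-SAMPA itself uses characters such as < and > in some symbols.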
I envision the use of <IPA> tags on Wikipedia to be fairly limited, in
that IPA/SAMPA will only appear on pages discussing pronunciations, but
I think it will make a good starting point, perhaps, for representing
pronunciations on Wiktionary.
I can attach the diffs for OutputPage.php and style/wikistandard.css to
a future message (if I can ever get the SourceForge CVS server to
respond), but what should I do with my .lex file and Makefile for
building the parser? Should I post them too? I would guess most list
readers would not be pleased by my spamming them with such tedium. I
understand the hesitation developers have to handing out CVS access to
people whose code they have never seen, so email me privately if you
want me to send you what I've done.
Thanks!
- David [[User:Nohat]]
Note: for most purposes, X-SAMPA is backwards compatible with SAMPA for
various languages, but the Lex analyzer can be modified to support
something like <IPA lang="French"> for the language-specific SAMPA
encodings, if that is deemed desirable.