[WikiEN-l] Re: Pronunciations and IPA/SAMPA

Adam Raizen araizen at newmail.net
Wed Sep 10 22:59:52 UTC 2003


David Friedland wrote:
 > Anyhow, it seems that just using the HTML entities for the Unicode IPA
 > extensions is not an acceptable solution because it leaves IE users with
 > lovely but useless rectangles where there ought to be IPA characters.
 > There is a LaTeX extension called TIPA that allows the complete set of
 > IPA characters and diacritics. If this were installed into the TeX math
 > extensions, then a similar syntax could be used to generate images of
 > the IPA from LaTeX input.
 > I see the following possible solutions (in the order that I think is
 > good):
 >
 > 1.) Auto-detect the browser and send IPA Unicode to browsers that
 > support it and TIPA LaTeX images to those that don't. (Pros: attractive
 > display of IPA for all users. Cons: lots of programming)
 >
 > 2.) Just send TIPA LaTeX images (Pros: attractive display of IPA. Cons:
 > Uses images in text when for some users embedded IPA Unicode would look
 > better)
 >
 > 3.) Store the IPA in a special format or in a special tag, auto-detect
 > the browser and send IPA Unicode to browsers that support it and SAMPA
 > to the rest. (Pros: doesn't require inserting images or using TeX. Cons:
 > SAMPA is ugly and hard to read)
 >
 > 4.) Render IPA into GIFs or PNGs and just insert them as images. (Pros:
 > compatible with everything. Cons: time-consuming, and difficult to
 > change)
 >
 > 5.) Devise a Wikipedia-specific pronunciation scheme and just use that
 > (blech!) (Pros: no coding required. Cons: YAAHPS (Yet Another Ad Hoc
 > Pronunciation Scheme))
 >
 > 6.) Do nothing and continue to allow people to use ad-hoc pronunciation
 > schemes (BLECH!!) (Pros: no action required. Cons: maintains status quo
 > harms as described above)

I was just thinking of this problem, and the idea I came up with was to
have an option in user preferences of something like "Display
pronunciations in: o Unicode IPA o SAMPA" and then anything in an
article which begins with "SAMPA " would be detected and displayed
correctly (converting SAMPA to IPA if necessary), similarly to the idea
with the magic ISBNs. I think this is probably the simplest solution to
get working quickly, and it can be easily expanded to include additional
ASCII IPA schemes (there are several) or auto-generated IPA images if
someone implements that. Also, someone using IE but who has the correct
fonts installed would be able to see IPA.
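
A minimal sketch of the idea, in Python for brevity and not taken from the
actual wiki code. The names SAMPA_TO_IPA, sampa_to_ipa and
render_pronunciations are hypothetical, the table covers only a small
subset of SAMPA symbols, and it assumes a transcription is the run of
non-space characters following "SAMPA ":

```python
import re

# Partial SAMPA -> IPA table (illustrative subset, not the full scheme).
SAMPA_TO_IPA = {
    "@": "\u0259",  # schwa
    "S": "\u0283",  # esh, as in "ship"
    "Z": "\u0292",  # ezh, as in "vision"
    "T": "\u03b8",  # theta, as in "thin"
    "D": "\u00f0",  # eth, as in "this"
    "N": "\u014b",  # eng, as in "sing"
    "I": "\u026a",  # small capital I, as in "kit"
    "U": "\u028a",  # upsilon, as in "foot"
    "{": "\u00e6",  # ash, as in "trap"
    ":": "\u02d0",  # length mark
    '"': "\u02c8",  # primary stress
    "%": "\u02cc",  # secondary stress
}

# Assumption: the transcription is everything up to the next whitespace.
SAMPA_TOKEN = re.compile(r"\bSAMPA (\S+)")


def sampa_to_ipa(sampa):
    """Convert character by character; unknown characters pass through."""
    return "".join(SAMPA_TO_IPA.get(ch, ch) for ch in sampa)


def render_pronunciations(text, preference="ipa"):
    """Rewrite every 'SAMPA xxx' token according to the user's preference."""
    def replace(match):
        if preference == "ipa":
            return "IPA " + sampa_to_ipa(match.group(1))
        # Any other preference leaves the SAMPA text exactly as stored.
        return match.group(0)

    return SAMPA_TOKEN.sub(replace, text)


print(render_pronunciations("fish (SAMPA fIS)", "ipa"))
print(render_pronunciations("fish (SAMPA fIS)", "sampa"))
```

A per-character lookup is enough for this subset; a scheme with
multi-character symbols would need a longest-match tokenizer instead.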

You malign ad hoc pronunciation schemes, but they do have *some*
redeeming value. You can use a single ad-hoc system to represent 
different dialects more easily than you can use IPA for the same 
purpose, since users will read their own dialect into the pronunciation 
guide for the ad-hoc system. Still, I can't imagine making up an ad-hoc
scheme for Wikipedia; IPA is probably best for us.

I'm digging around the code to see how this could be done (and learning
PHP), but in the meantime, any comments?

(Anything more on this should probably go to wikitech-l.)

-- Adam Raizen




