Re: [Wiktionary-l] [Commons-l] Sound files

11 Feb 2007

Hoi,
The problem is that existing academic software like "praat" uses .wav
files. I do sympathise, up to a point, with the concern that storage is
used. However, the price of a terabyte of storage is such that this is
not that relevant. Both a .wav and an .ogg file would be saved. The
first is to enable science to do its thing, the second is for our punters.
Thanks,
    GerardM

http://www.fon.hum.uva.nl/praat/

Gregory Maxwell wrote:
...
  On 2/11/07, Gerard Meijssen <gerard.meijssen(a)gmail.com> wrote:
  Hoi,
 I read this in digest mode so let me answer things together.

 The reason why .ogg files are not great is that they do indeed use a
 lossy algorithm. There is some great software to analyse pronunciation
 files; a program called "praat" is worth mentioning, and it is even
 licensed under the GPL. It also has functionality for IPA transcription.

 Gregory's proposal to use Ogg/FLAC is not helpful. This is not the
 format that is used to analyse pronunciation files. The notion that a
 specific quality was "the gold standard" at the time is indeed that. It
 used to be; times have changed.

 The Shtooka program that we are talking about CAN create both a WAV and
 an OGG file. It just needs asking. It would be helpful if we learn
 sooner rather than later what the outcome is of this request.

 Ogg/FLAC is lossless, so it removes your concerns about lossiness.
 It can be uploaded today, so it removes the problem of not being
 uploadable. It is compressed (losslessly), so it's not quite so bad on
 our storage and bandwidth. Shtooka already outputs FLAC, and could be
 trivially altered to output Ogg/FLAC; if you'd like, I will do this for
 you. Any number of Ogg/FLAC files can be quickly converted to WAV with
 a single command.
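
 As an illustration of that batch conversion, here is a minimal Python
 sketch that wraps the reference `flac` command-line decoder. It is not
 the command Gregory had in mind, only one way it might look. It assumes
 the `flac` tool is installed and on the PATH; the function names and
 the "*.ogg" pattern are hypothetical choices for this example.

```python
import subprocess
from pathlib import Path
from typing import List


def build_decode_command(src: Path) -> List[str]:
    """Return the flac command line that decodes one Ogg/FLAC file to WAV."""
    wav = src.with_suffix(".wav")  # same name, .wav extension
    return [
        "flac",
        "--decode",   # decode instead of encode
        "--ogg",      # treat the input as Ogg FLAC
        "--force",    # overwrite an existing .wav of the same name
        "--silent",   # no progress output
        "-o", str(wav),
        str(src),
    ]


def convert_directory(directory: Path, pattern: str = "*.ogg") -> List[Path]:
    """Decode every matching file in `directory`; return the WAV paths.

    Assumption: every file matching `pattern` is Ogg/FLAC. An Ogg/Vorbis
    file would make flac exit with an error, which check=True turns into
    a CalledProcessError.
    """
    written = []
    for src in sorted(directory.glob(pattern)):
        subprocess.run(build_decode_command(src), check=True)
        written.append(src.with_suffix(".wav"))
    return written
```

 Calling convert_directory(Path("recordings")) would then decode each
 matching file in a (hypothetical) "recordings" folder and leave a .wav
 file next to it, ready for tools such as praat.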

 I am very hesitant and concerned about the prospect of permitting
 uncompressed files: I think people will use them where they are
 completely inappropriate because they are a bit easier to play back.
 FLAC or Ogg/FLAC should be substantially smaller than WAV and won't
 drive people to use uncompressed formats for bad reasons.