Groan..
I just discovered that the version of mpg123 on the latest Red Hat, which is what I use to decode MPEGs into .wav, is actually not mpg123 but a link to mpg321, which silently ignores the options I give it telling it to downsample by 2 and convert to mono. So all the pfiles (and hence the .htk files) are twice as big as they should be.
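A sanity check along these lines would probably have caught the problem sooner. It is only a sketch: the function names are made up for illustration, and the expected format (mono, 22050 Hz) assumes the source audio is 44.1 kHz stereo, which the post does not actually state. It follows symlinks to see which binary really runs, and reads the header of a decoded .wav to confirm the downsampling and mono conversion happened.

```python
import os
import wave


def resolve_binary(path):
    """Return the real file behind a path, following any symlinks."""
    return os.path.realpath(path)


def wav_format(path):
    """Return (channels, sample_rate) as recorded in the WAV header."""
    with wave.open(path, "rb") as w:
        return w.getnchannels(), w.getframerate()


def check_decoded(path, expected_channels=1, expected_rate=22050):
    """Return True if the WAV has the expected channel count and rate.

    The defaults are an assumption: 44.1 kHz stereo input, downsampled
    by 2 and mixed to mono.
    """
    return wav_format(path) == (expected_channels, expected_rate)
```

Calling `resolve_binary` on the path where mpg123 is installed shows whether it really points at mpg321, and running `check_decoded` on each decoded file before building pfiles flags any file the decoder left at the original rate or in stereo.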
The annoying thing is that I'm not quite sure whether the anchor nets were trained on these files or on files that were correctly downsampled. It looks OK: the nets are dated July 31 2002, which means they were trained at NEC. I looked there, and those pfiles (dated from that July as well) are OK, i.e. half the size of the ones I have now on blush.
Comments
a: I finally got a great mp3 decoder with easy source (the others were too optimized/convoluted to adapt): libmad, an integer decoder. There's a simple hook you can use to run stuff against the PCM buffers. You can also get at the coeffs in the .mp3 chunks.
Posted by: brian | March 30, 2003 06:35 PM