I have a problem converting PCM wav audio -> TrueSpeech -> PCM.

Compression doesn't appear to be the problem; it's the decompression.
The decompressed size is smaller than the original uncompressed size and, curiously, the number of missing bytes is exactly equal to the compressed size.

I expected that if I compressed 4 KB and then decompressed it, I would get 4 KB back.
Note that I am working with buffers of fixed size, and that I have experimented with the Flags.

Backtracking the problem, I found that acmStreamSize shows the same thing when I use it to suggest buffer sizes for the compression and decompression streams.
If I ask what size the compressed buffer should be for 4 KB of input, I am told 256 bytes; if I ask what size the decompressed buffer should be for 256 bytes, I am told (4 KB - 256) bytes.

Can someone please kick some sense into me, and/or verify this issue?
I'd be happy to provide more information such as the format specifications etc.

Posted on 2007-07-08 08:13:13 by Homer
Have you tried compress->decompress and then comparing the source and decompressed wave files in an audio editor? Or are things *way* off, so that this doesn't even make sense?
Posted on 2007-07-08 11:56:14 by f0dder
No, I have not; I am not aware of any audio editor that lets me select arbitrary audio codecs.
There are plenty of tools that perform conversions with specific codecs, such as wav-mp3-wav converters, sure enough, but one that enumerates all installed codecs and their format specs?
If you can suggest such a tool, I'll give it a go, out of sheer curiosity.

Posted on 2007-07-09 05:24:11 by Homer

acmFormatChoose helps...

Why not talk in terms of "samples" for the PCM?
Compare the original and resulting PCM (compressed then decompressed) - to see if there's a delay of 128 samples.
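A minimal sketch of such a delay check, in Python rather than MASM so that it stays short. The function name `best_lag` and the synthetic test signal are made up for illustration. The output of a real lossy codec will not match the input sample for sample, so the correlation peak would be less clean than it is here.

```python
import random

def best_lag(original, decoded, max_lag):
    """Return the lag (in samples) at which decoded best lines up with original."""
    best, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlate original[i] with decoded[i + lag] over the overlapping part.
        score = sum(a * b for a, b in zip(original, decoded[lag:]))
        if score > best_score:
            best, best_score = lag, score
    return best

# Synthetic stand-in for real audio: noise, and the same noise delayed by 128 samples.
rng = random.Random(0)
original = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
decoded = [0.0] * 128 + original[:-128]

print(best_lag(original, decoded, 256))
```

With real data, `original` and `decoded` would be the 16-bit samples read from the two PCM buffers. A clear peak at 128 would suggest the audio is delayed rather than lost.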
Posted on 2007-07-09 08:22:17 by Ultrano
I did not try 'GoldWave' yet.
But I did try acmFormatChoose, and verified that my PCM wave format was legit, and was supported natively by my audio hardware.
Debugging the returned WAVEFORMATEX fields, I noticed that this API forgets to set the cbSize field, which I found amusing.

Anyway, I'm no closer to an answer. I am recording and playing audio as 16-bit, 8 kHz PCM, and wish to convert (compress) it to the TrueSpeech 1-bit format.
My code contains a hardcoded version of the PCM format and obtains a suitable TrueSpeech format via enumeration. Both formats are used together in various source code other than my own, and their fields all look OK to me, so in theory I could have hardcoded both wave format structs.

Note that I'm using DirectSound to perform the capture and playback, in order to benefit from its Buffer Notification system.

What seems most odd to me right now is what I said about acmStreamSize: the suggested buffer sizes for the input and output of the two conversion streams differ.
The fact that they differ by the size of one TrueSpeech compressed frame may be a coincidence.
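One way to check whether it is a coincidence is to redo the arithmetic in whole codec blocks. The sketch below is Python and rests on two assumptions that nothing in this thread confirms. First, each 32-byte TrueSpeech block (nBlockAlign) is taken to hold 240 PCM samples, inferred from 8000 samples/s * 32 bytes / 1067 bytes/s. Second, the size estimate is taken to round down to whole blocks. The function name `whole_block_sizes` is made up for illustration.

```python
# Assumed framing: 240 samples per 32-byte block, inferred from the format
# fields (8000 * 32 / 1067 is roughly 240), not read from the codec itself.
PCM_BYTES_PER_SAMPLE = 2
SAMPLES_PER_BLOCK = 240
BLOCK_BYTES = 32
PCM_BYTES_PER_BLOCK = SAMPLES_PER_BLOCK * PCM_BYTES_PER_SAMPLE  # 480

def whole_block_sizes(pcm_bytes):
    """Return (packed, unpacked) byte counts if only whole blocks are converted."""
    blocks = pcm_bytes // PCM_BYTES_PER_BLOCK
    return blocks * BLOCK_BYTES, blocks * PCM_BYTES_PER_BLOCK

for size in (4096, 4000):
    packed, unpacked = whole_block_sizes(size)
    # Columns: input bytes, packed bytes, unpacked bytes, bytes left unconverted
    print(size, packed, unpacked, size - unpacked)
```

Under these assumptions, a 4096-byte input gives 256 packed and 3840 unpacked bytes, so the shortfall happens to equal the packed size. A 4000-byte input also gives 256 and 3840, but the shortfall is only 160 bytes, so the match at 4096 would be a coincidence of that particular buffer size.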

This is driving me crazy.
Would you like to see the wav formats being used to create the two conversion streams?
Posted on 2007-07-12 01:26:00 by Homer
Guys, I threw together a simple demo to prove my point.
Given a hardcoded PCM wave format, this demo pokes around for an appropriate TrueSpeech format, fires up two streams for conversion to and from TrueSpeech, queries for appropriate buffer sizes, and finally reports the sizes of the raw input frame, the compressed frame, and the decompressed frame.
You can clearly see that there's an issue.
Please feel free to slap me if you can see what's wrong here. This code is quite similar to what I'm using; I've only 'de-oopified' it, since a lotta you guys don't like looking at such code.

If you can spot any issues, please tell me!

.386
.model flat, stdcall
option casemap:none

include c:\masm32\include\windows.inc
include c:\masm32\include\kernel32.inc
include c:\masm32\include\user32.inc
include c:\masm32\include\ACM.inc
include c:\masm32\include\winmm.inc
includelib c:\masm32\lib\kernel32.lib
includelib c:\masm32\lib\user32.lib
includelib c:\masm32\lib\msacm32.lib

; Memory-to-memory move via the stack
m2m macro dst,src
    push src
    pop dst
endm

.data
OurHadId dd 0                   ; driver ID of the codec that offers our format
OurHad dd 0                     ; handle of that driver once opened
hCompressor  dd 0
hDecompressor dd 0
RawFormat WAVEFORMATEX <>       ; uncompressed PCM format (declaration assumed; the code references it)
VendorExtensionR db 32 dup (?)
CompressedFmt WAVEFORMATEX <>
VendorExtensionW db 32 dup (?)
fd ACMFORMATDETAILS <>          ; scratch struct for acmFormatEnum (declaration assumed)

szerr db "ERROR",0
szok db "OK",0
szerr1 db "Failed to find appropriate driver and/or format",0
szerr2 db "Failed to Open a Driver during enum",0
szlu db "Input 4000 - Packed %lu - Unpacked %lu",0

.code

FormatEnumProc proc uses ebx hadid, pafd:ptr ACMFORMATDETAILS, dwTagWeWant, fdwSupport
    .if dwTagWeWant!=0
        mov ebx,pafd
        mov eax,[ebx].ACMFORMATDETAILS.dwFormatTag
        .if eax==dwTagWeWant
            ; Note the ID of the driver that offers this format
            m2m OurHadId,hadid
            mov eax,FALSE       ; stop enumerating now
            ret
        .endif
    .endif
    mov eax,TRUE                ; continue enumerating
    ret
FormatEnumProc endp

DriverEnumProc proc hadid, dwTagWeWant, fdwSupport
LOCAL had, dwVendor
    ; Open the driver.
    invoke acmDriverOpen,addr had, hadid, 0
    .if eax!=0
        invoke MessageBox,0,addr szerr2,addr szerr,MB_OK+MB_ICONERROR
        mov eax,FALSE           ; had is not valid, so stop here (assumed handling)
        ret
    .endif
    ; Ask the driver for the size of its largest format structure.
    mov dwVendor,0
    invoke acmMetrics,had, ACM_METRIC_MAX_SIZE_FORMAT, addr dwVendor
    .if dwVendor < sizeof WAVEFORMATEX
        mov dwVendor, sizeof WAVEFORMATEX   ; for MS-PCM
    .endif
    ; Enumerate the driver's formats; a match is written to CompressedFmt.
    invoke RtlZeroMemory,addr fd,sizeof fd
    mov fd.cbStruct, sizeof fd
    mov fd.pwfx, offset CompressedFmt
    m2m fd.cbwfx, dwVendor
    mov fd.dwFormatTag, WAVE_FORMAT_PCM
    invoke acmFormatEnum,had, addr fd, FormatEnumProc, dwTagWeWant, 0
    .if eax!=0
        invoke MessageBox,0,addr szerr,addr szerr,MB_OK+MB_ICONERROR
    .endif
    invoke acmDriverClose,had, 0
    .if OurHadId!=0
        mov eax,FALSE           ; stop enumerating now
        ret
    .endif
    mov eax,TRUE                ; continue enumeration
    ret
DriverEnumProc endp

SetupStreams proc
LOCAL dPacked,dUnpacked
LOCAL buf[256]:BYTE
    ; In this simplified demo, I have hardcoded my desired PCM wave format
    mov RawFormat.cbSize,0
    mov RawFormat.wFormatTag,WAVE_FORMAT_PCM    ; Pulse-Code Modulation
    mov RawFormat.nChannels,1                   ; mono, cuz stereo costs too much
    mov RawFormat.nSamplesPerSec,8000           ; 8 kilohertz sample rate
    mov RawFormat.wBitsPerSample,16             ; 16 bits per sample
    mov RawFormat.nBlockAlign,2                 ; #chans * bitspersample / 8
    mov RawFormat.nAvgBytesPerSec,16000         ; samplespersec * blockalign

    ; This is what the TrueSpeech compressed format WILL look like
    ; mov CompressedFmt.wFormatTag , 34
    ; mov CompressedFmt.nChannels , 1
    ; mov CompressedFmt.nSamplesPerSec , 8000
    ; mov CompressedFmt.nAvgBytesPerSec , 1067
    ; mov CompressedFmt.nBlockAlign , 32
    ; mov CompressedFmt.wBitsPerSample , 1
    ; mov CompressedFmt.cbSize , 32

    ; Find a driver that offers format tag 34 (TrueSpeech)
    invoke acmDriverEnum,addr DriverEnumProc, 34, 0
    .if eax!=0 || OurHadId==0
        invoke MessageBox,0,addr szerr1,addr szerr,MB_OK+MB_ICONERROR
        ret
    .endif
    mov OurHad,0
    invoke acmDriverOpen,addr OurHad, OurHadId, 0
    .if eax!=0
        invoke MessageBox,0,addr szerr,addr szerr,MB_OK+MB_ICONERROR
        ret
    .endif
    ; Open the Compression stream..
    invoke acmStreamOpen,addr hCompressor,\
                        OurHad,\ ; Driver handle
                        addr RawFormat,\ ; Source format
                        addr CompressedFmt,\ ; Destination format
                        NULL,\ ; No filter
                        NULL,\ ; No callback
                        0,\ ; Instance data (not used)
                        ACM_STREAMOPENF_NONREALTIME ; Flags
    .if eax==0
        ; Open the Decompression stream..
        invoke acmStreamOpen,addr hDecompressor,\
                        OurHad,\ ; Driver handle
                        addr CompressedFmt,\ ; Source format
                        addr RawFormat,\ ; Destination format
                        NULL,\ ; No filter
                        NULL,\ ; No callback
                        0,\ ; Instance data (not used)
                        ACM_STREAMOPENF_NONREALTIME ; Flags
        .if eax==0
            ; How many packed bytes for 4000 bytes of PCM input?
            invoke acmStreamSize,hCompressor,4000,addr dPacked,ACM_STREAMSIZEF_SOURCE
            .if eax==0
                ; The commented line is an alternative, it's returning the same value anyhoo
                ;invoke acmStreamSize,hDecompressor,dPacked,addr dUnpacked,ACM_STREAMSIZEF_SOURCE
                invoke acmStreamSize,hCompressor,dPacked,addr dUnpacked,ACM_STREAMSIZEF_DESTINATION
                .if eax==0
                    invoke wsprintf,addr buf,addr szlu,dPacked,dUnpacked
                    invoke MessageBox,0,addr buf,addr szok,MB_OK+MB_ICONERROR
                .endif
            .endif
            invoke acmStreamClose,hDecompressor,0
        .endif
        invoke acmStreamClose,hCompressor,0
    .endif
    invoke acmDriverClose,OurHad,0
    ret
SetupStreams endp

start:
    call SetupStreams
    invoke ExitProcess,0

end start

Posted on 2007-07-14 02:09:29 by Homer
Maybe there is something wrong with my setup (I just installed MASM32 v9 to compile your code), but:

fatal error A1000: cannot open file : c:\masm32\include\ACM.inc

What happens if you replace

invoke acmStreamSize,hCompressor,4000,addr dPacked,ACM_STREAMSIZEF_SOURCE

with

invoke acmStreamSize,hCompressor,4096,addr dPacked,ACM_STREAMSIZEF_SOURCE

Posted on 2007-07-14 04:14:29 by Frank
Whatever frame size you specify is not relevant; the output frame is smaller than the input frame, and that's the point.

I can understand that one page is a neater value. I chose 4000 because it's an even segment of a 16000-byte buffer: the PCM wave format I'm using is 8000 samples per second at 16 bits (two bytes per sample), giving 16000 bytes per second, so 4000 is one quarter of my one-second buffer. It's not important, though, as this demo has no actual buffers. I am only showing that the codec suggests an output buffer for decompressed data that is smaller than the input buffer for uncompressed data. This is PCM wave data, where size determines the rate it plays at, so you're effectively losing data, losing time. That is not acceptable, and it seems quite odd to me: I can understand a loss of QUALITY, but not of QUANTITY...
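A hedged sketch of what the 4000-byte quarter-second buffers would look like if the converter only consumes whole codec blocks and the caller carries the unconverted tail into the next buffer. ACMSTREAMHEADER does have a cbSrcLengthUsed field reporting how much source was consumed. Whether the TrueSpeech codec behaves as modelled here is an assumption, as is the figure of 480 PCM bytes per block (240 samples * 2 bytes, inferred from nBlockAlign 32 and nAvgBytesPerSec 1067). The function name `feed` is made up, and the example is Python to keep it short.

```python
PCM_BYTES_PER_BLOCK = 480  # assumption: 240 samples * 2 bytes per 32-byte block

def feed(buffer_sizes):
    """Simulate feeding PCM buffers to a converter that consumes whole blocks only.

    Returns one (consumed, carried) pair of byte counts per buffer, where
    carried is the tail that has to be prepended to the next buffer.
    """
    carried = 0
    history = []
    for size in buffer_sizes:
        available = carried + size
        consumed = (available // PCM_BYTES_PER_BLOCK) * PCM_BYTES_PER_BLOCK
        carried = available - consumed
        history.append((consumed, carried))
    return history

# Four quarter-second buffers, i.e. one second of 8 kHz 16-bit mono PCM.
history = feed([4000] * 4)
for consumed, carried in history:
    print(consumed, carried)
```

In this model the individual buffers convert as 3840, 3840, 4320 and 3840 bytes. Across the whole second only the last 160 bytes are still pending, so nothing is lost as long as the tail is carried forward rather than dropped.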
Posted on 2007-07-14 14:51:15 by Homer