Heya :)
Please test the attached application and let me know if it's stable.
It's a UDP-based VOIP (voice over IP) thingy, with a fairly crude RLE compression scheme that works great when you're compressing silence, lol.. anyway, it's set up for 8 kHz, 8-bit audio. It Sends, but it doesn't Recv yet - this is just a Send test, ok guys :)
Those of you who have DebugCenter installed will see runtime debug spew..

I'm thinking about expanding it to compress the "incompressible" data to around 50% by encoding delta-value bytes as nybbles whenever the waveform delta fits in 4 bits or less.
This should give me a reasonably fast lossless compression algo that yields reasonably decent compression ratios for WAVE input data.
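For what it's worth, the nybble-delta idea can be sketched roughly like this. This is a guess at one workable layout, not Homer's actual code: the escape nibble, the length header, and all the names are mine, and a real implementation would be in asm.

```python
import struct

ESC = 0x8  # reserved nibble: the next two nibbles carry a raw 8-bit sample

def encode(samples):
    """Delta-encode 8-bit samples: deltas in [-7, 7] take one nibble,
    anything larger escapes to three nibbles (ESC + the raw byte)."""
    nibbles = []
    prev = 0
    for s in samples:
        d = s - prev
        if -7 <= d <= 7:
            nibbles.append(d & 0xF)          # small delta, one nibble
        else:
            nibbles.extend((ESC, (s >> 4) & 0xF, s & 0xF))  # raw byte
        prev = s
    if len(nibbles) % 2:
        nibbles.append(0)                     # pad to a whole byte
    packed = bytes((nibbles[i] << 4) | nibbles[i + 1]
                   for i in range(0, len(nibbles), 2))
    return struct.pack("<H", len(samples)) + packed  # sample count header

def decode(blob):
    """Reverse of encode(); the 16-bit header tells us when to stop."""
    count, = struct.unpack("<H", blob[:2])
    nibbles = []
    for b in blob[2:]:
        nibbles.extend((b >> 4, b & 0xF))
    out, prev, i = [], 0, 0
    while len(out) < count:
        n = nibbles[i]; i += 1
        if n == ESC:
            prev = (nibbles[i] << 4) | nibbles[i + 1]; i += 2
        else:
            prev = (prev + (n if n < 8 else n - 16)) & 0xFF  # sign-extend
        out.append(prev)
    return out
```

On silence (constant samples, all deltas zero) this lands right around the claimed 50%, one nibble per input byte; the flip side is that data full of large deltas *expands* to 150%, since each such sample costs three nibbles.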

Make sure to Set the Host before you Talk :) (you can change it again if you like)
If you have problems, please tell me what your audio hardware is.

Note : you can Lock the Talk button by left-clicking it, then sliding off it before releasing the mouse button.


Attachments:
Posted on 2006-04-29 10:47:03 by Homer
It gives me something like "masm32 path not found" and "debug center not found", and then crashes.

About the compression: use the Audio Compression Manager and let the user choose his/her favorite compression method... or choose one yourself. I suggest GSM 6.10 or CCITT u/a-law.

The attached file is a little demo app showing how to (de)compress a wave using MS ACM. The source is quite unclean, but you can get the idea of how things work. After every compression the app shows some messageboxes. They're for testing purposes, so just ignore them.
Attachments:
Posted on 2006-04-29 10:59:53 by ti_mo_n
Thanks - I'll read that source tomorrow and implement the ACM stuff.. I don't have file headers, I'm dealing with raw wave data in a known format.. a stream sample would have been more appropriate, but I'll figure it out :)

In regards to the crash - I can't say what went wrong because DebugCenter is not installed :)
DebugCenter is an executable which my beta demos interact with, basically it's just a window to capture debug messages in a separate process.
You can obtain the DebugCenter component of OA32 as a separate download from Biterider's site : http://objasm32.tripod.com
Just put the small executable someplace safe and execute it (from there) ONCE to self-register it.
After that, it will be launched automatically by any beta apps that have debug spew..



Posted on 2006-04-29 11:28:03 by Homer
I have debug center, but it crashes saying that it can't find masm's path :P

The attached file is acm.inc with all the stuff regarding MSACM in it. It's in TASM syntax, but it's quite straightforward, so it shouldn't be difficult to translate.
Attachments:
Posted on 2006-04-29 12:08:59 by ti_mo_n
Hi ti_mo_n
Is it possible that you are using an old DebugCenter release? Try this one... If you still have problems, please write down the text in the messagebox.

Biterider
Attachments:
Posted on 2006-04-29 13:20:55 by Biterider
Thank you - it works now. Homer's app displays some text about buffers being 'good to go'. It also displays some info when I press "talk". Trying to connect anywhere fails with either "bad host" or "failed bind" (probably because there is no host for this app?).
Posted on 2006-04-29 13:25:20 by ti_mo_n
That's right - ignore the "failed Bind" message.
Binding is not at all necessary for SENDING packets..
( you should have no issue Binding to local addresses such as loopback ;) )
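For anyone wondering why Bind isn't needed: a UDP socket gets an ephemeral source port assigned by the OS on its first send. A minimal sketch in Python (the destination port here is just an example, not the one Homer's app uses):

```python
import socket

# Create a UDP socket; no bind() call is needed just to send.
# The OS assigns an ephemeral source port on the first sendto().
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sent = sock.sendto(b"hello", ("127.0.0.1", 5004))  # hypothetical port
sock.close()
```

The sendto() itself succeeds even with nobody listening; with UDP you only find out about an unreachable peer (if at all) via a later ICMP error.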

The other one, "Bad Host", indicates a DNS lookup failure (ie, failed to resolve hostname to ip).. this is kinda bad.

Either way, pressing that button (re)creates and initializes the Socket for udp, so nothing will be sent until you do this first (you will see "transmission error" in the debug spew)

Currently the application does not handle partial sends well at all - it detects them, but does nothing to correct the situation.
Similarly, packets are not yet marked with a Counter value to deal with out-of-order reception, although the existing custom RLE compression algorithm does allow us to identify incomplete packets. The protocol will likely be defined within the context of a tutorial aimed at beginners in network programming (there's no protocol yet, per se).

I was mostly interested in the stability of the application - ie, does the application crash under load? (How long can you send packets before it falls down? DOES it fall down?)

I'll be looking into the ACM stuff shortly, it's been a busy day.. perhaps I will repost today, perhaps not..
Posted on 2006-04-30 02:59:27 by Homer

Heya - I've updated the previously attached zipfile.
The new version will search through the installed codecs looking for TrueSpeech support.
I'm led to believe that TrueSpeech shipped with everything from Win95 onwards, so I am expecting no problems... let me know if there are!

(note : I haven't implemented the compression yet, just a search for a particular codec..)
Posted on 2006-04-30 11:48:12 by Homer
It prints a lot of text and ends with "codec not available". There is a TrueSpeech in this 'log'. Please see the attached output.
Attachments:
Posted on 2006-04-30 13:16:59 by ti_mo_n
I see.. I was searching for the codec by the Long Name.
My log says "Long name:  DSP Group TrueSpeech(TM) Software CODEC"

I changed the code to search for the first codec that supports FormatTag 34 (TrueSpeech).
Also, I changed the Capture audio format from 8bit, 8khz to 16bit, 8khz.
This means our one second recording buffer is now 16000 bytes.
There are still 16 notification positions distributed across the buffer.
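As a quick sanity check on that math (a trivial sketch; the real DirectSound setup fills in DSBPOSITIONNOTIFY structures, which I'm not reproducing here, and whether you notify at the end or the start of each slice is a design choice):

```python
SAMPLE_RATE = 8000        # 8 kHz
BYTES_PER_SAMPLE = 2      # 16-bit mono
NOTIFY_COUNT = 16

buffer_bytes = SAMPLE_RATE * BYTES_PER_SAMPLE     # one second of audio
chunk_bytes = buffer_bytes // NOTIFY_COUNT        # data per notification
# one notification at the last byte of each 1/16th of the buffer
positions = [(i + 1) * chunk_bytes - 1 for i in range(NOTIFY_COUNT)]
```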

Having made that change to the capture format, I was then able to create two ConversionStreams (one to compress wav-->truespeech, the other to decompress truespeech-->wav).

I've updated the zip again for the changes outlined above.
Other changes include the removal of the custom compression code and the disabling of the SendTo code. You can ignore the 'Set Host' button, because pressing 'Talk' will simply show the accessing of "chunks of 1/16 of the capture buffer" (via the notifications mentioned previously).

All that remains now is to add some calls to perform the streaming conversion of the accessed portions of the buffer, and then write the "playback" side (create playback buffer, access portions of it, etc)
Posted on 2006-05-01 04:33:21 by Homer

Added code for Playback.
We no longer Start and Stop Capture with the button - it toggles a flag var instead.
The reason is that we can't set up Notifications on a Playback buffer.

The capture and playback buffers are created together, in the same format (pcm-wave) and the same size.. then they are both started running, and then they basically never stop. I had hoped that I'd be able to get them close to synchronized, and that they'd move together, but the reality turned out to be otherwise.. more work.

The periodic notifications are our cue to grab the current hardware offsets in both buffers and access the buffers 'safely' (ie, avoiding the region that the hardware is using in each).

The new boolean flag, bIsRecording, allows the notification processing loop to 'ignore the capture buffer for the period of the current notification'.

Updated the zip again :)

Posted on 2006-05-01 12:00:53 by Homer
To set notification positions on playback buffers, the buffers must be created with DSBCAPS_CTRLPOSITIONNOTIFY.

> DSBUFFERDESC
Posted on 2006-05-01 12:11:40 by ti_mo_n
Ok, cool :) ... but it's not actually worth doing - we just need periodic servicing of the buffers, and the Capture notifications can serve both. Either way, we should still check the current hardware positions within the notification handler(s), so I'm not seeing any benefit from having two sets of notifications at all... unless you wanted playback and capture to be managed by separate threads, which in my mind is even worse - extra overhead and complexity for no logical reason. Agree?
Posted on 2006-05-01 12:56:19 by Homer
The play buffer can lag in some situations. You can't assume that both buffers are at the same play position just because they were started at the same time. The main reason for the lag is the 'start lag' - it's a few milliseconds from one 'play' command to the other. Another lag comes from refilling the buffers - the memory must be copied from one place to another. A third source of lag is mixing: software mixing may mix one sound and then proceed to mix the other one (while the first one is already playing). It's not the case on modern systems, because they're very fast, but if you want to be "mr. proper, clean & clear" then you should give every buffer its own notification positions and act on them separately (threads strongly recommended).
Posted on 2006-05-01 13:24:14 by ti_mo_n
I've succeeded in implementing what I described - ie, sharing the notifications, thinking of them only as a periodic firing mechanism for the code which services BOTH buffers.
The updated demo shows the locking and unlocking of the Play buffer all the time, and the locking and unlocking of the Capture buffer as well, should you press Talk.

The memory accesses are sequential, in equal divisions of time regardless of their lag... but the size of the accesses depends on the lag, for we access all the memory "from our last access address to the current hardware pointer", always working behind the region that the hardware is accessing in both buffers.
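That "from our last access address to the current hardware pointer" region is just modular distance on a ring buffer. A sketch of the arithmetic (names and function shape are mine, not from the demo's source):

```python
def readable_span(last_read, hw_pos, size):
    """How many bytes are safe to read: from our previous read offset
    up to (but not past) the hardware cursor, modulo the buffer size."""
    return (hw_pos - last_read) % size

def spans(last_read, hw_pos, size):
    """The same region as (offset, length) pieces, split at the point
    where the ring buffer wraps back to offset zero."""
    n = (hw_pos - last_read) % size
    first = min(n, size - last_read)   # up to the end of the buffer
    parts = [(last_read, first)]
    if n > first:
        parts.append((0, n - first))   # remainder from the start
    return parts
```

This is also why the access sizes vary with lag: the time between notifications is fixed, but the distance the hardware cursor has moved since our last visit is not.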

We can play and record at the same time under a single thread.
It is less complex, it uses less resources, and it performs exactly the same tasks.
Posted on 2006-05-01 13:40:05 by Homer
OK. I didn't know you do it THIS way ;)
Posted on 2006-05-01 13:44:05 by ti_mo_n
Nice little idea you have here...
I will test it out for you. I am running Microsoft Windows XP Professional, 98, 2000, and some others on my home PCs.
Posted on 2006-05-01 19:09:38 by COREY
Thanks, I appreciate it.. you'll need DebugCenter installed to see what's happening at runtime.
I'm not expecting any platform-specific problems, but you never know..
Posted on 2006-05-02 00:41:24 by Homer
I've posted another update.
This version compresses the Captured audio data.
If you study the debug spew, you will see stuff like :
dwSize1 = 2428t, Actual #Bytes Locked
pCaptured = 008DD556h
ashc.cbDstLengthUsed = 160t


Here we can see that a chunk of input wave data of 2428 bytes was compressed to 160 bytes :)

In this version, I had to deal with a new special case: TrueSpeech was refusing to compress the input data if there were fewer than several hundred bytes..

This happens regularly when the Capture hardware pointer wraps past the end of the buffer, leaving us to read "from wherever we were last time to the end of the buffer".
My solution was to handle insufficient data as follows:
If there's less than 512 bytes available, copy it to a temp buffer, and force our "last read offset" to zero (the start of the buffer)... then, in the next iteration, append any further available data to the temp buffer, and compress the temp buffer.

It seems to work just fine :)

I'm now ready to start Sending my compressed data again, but before I do that, I'll start setting up the RX (receive) networking code (for the playback side of the app).
As soon as I'm ready to both Send and Receive, I'll tidy up the source and post it complete in the Networking section in a new thread entitled something like "Voice Protocol Design" :D

Posted on 2006-05-02 05:02:40 by Homer
OK, I'm now loopback testing .. Sending and Receiving the compressed data, and compressing and decompressing it and stuff like that...

I had problems with storing decompressed data in the PlayBuffer, but I've come up with a solution to the problem which involves marking the Sent packets with the buffer offset at which they were recorded.
When we receive and decompress some data, we write it to the playbuffer at the offset indicated in the packet if it is "safe to do so".. otherwise we cache the packet until the next Notification occurs, by which time it should be safe to store the packet.
There is a 'danger zone' that we should avoid accessing in the buffer space.
It begins at the current position of the buffer's Hardware cursor.
Where it ends depends on the audio card you are using.. I'm going to define a "danger zone size" of one and a half Notification time divisions so as to be 'extra safe' :)
If the Offset indicated within a received packet lays within the 'danger zone', we'll cache the data by adding it to a DataCollection ;)
Each packet is being received into a heap-allocated memory object that is perfectly suitable for storing in a DataCollection.
Each time a Notification fires, we'll just try to empty the DataCollection.
Basically all this means that:
- packets arriving out of order are irrelevant since we reconstruct the buffer with what is effectively a time-marker.
- we can handle more than one packet at a time if we need to, which is important because we can expect weirdness and handle it. I'm thinking ahead a little toward sinking multiple sources and mixing them locally, but anyway.. mixing is something I'm yet to look into.
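The danger-zone test is again just modular arithmetic on the ring, and the cache is a retry list drained on each notification. A sketch under those assumptions (the 1.5-chunk zone size comes from the post; the names and the callback shape are mine):

```python
def in_danger_zone(offset, hw_cursor, zone, size):
    """True if writing at `offset` would land inside the region the
    hardware is about to play: [hw_cursor, hw_cursor + zone) mod size."""
    return (offset - hw_cursor) % size < zone

def service_playback(packets, hw_cursor, zone, size, write):
    """Write each (offset, data) packet whose target lies outside the
    danger zone; return the rest to be retried on the next notification."""
    cached = []
    for off, data in packets:
        if in_danger_zone(off, hw_cursor, zone, size):
            cached.append((off, data))   # too close to the hw cursor
        else:
            write(off, data)             # safe: store it in the playbuffer
    return cached
```

With a 16000-byte buffer and 1000-byte chunks, the zone would be 1500 bytes ahead of the hardware cursor, wrapping past the end of the buffer when needed.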

I haven't sent a new update yet because I want to try implementing what I just described first..

Something interesting I noticed is that a chunk of data gets smaller when it is compressed and then decompressed.. this worries me, since it is going to create 'gaps' in the playbuffer :( I was under the impression that I might lose some quality, but it seems unacceptable that I should lose Size: in a pcm-wave, the size is critical, since it is so strongly associated with Time. I'm predicting a 'machinegun' kind of audio artifact, because there's '16 gaps per second' :(

Anyone care to comment?

Posted on 2006-05-02 13:36:36 by Homer