I've improved the implementation of MP3 Streaming, eliminating 'popping/glitching' artifacts, while still using the Event Notification scheme to refill half my SoundBuffer while the other half is Playing.
The trick was in dealing with the 'overflow condition', where we decompress more wav data than will fit in our 'halfbuffer'... I'm now caching that extraneous data in-situ, consuming it as soon as we're ready for it.
This solution is ideal: there's no 'double buffering' or unnecessary moving of data, and I've not been forced to march down the Microsoft MVP path of doing my own polling via GetCurrentPosition (what a god-awful solution they proposed).
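Roughly, the idea looks like this in C (a minimal sketch only - the structure and function names are mine, not the engine's, and DecodeMoreWav stands in for the ACM decompression step):

#include <windows.h>
#include <string.h>

typedef struct {
    BYTE  pcm[65536];     /* output of the last decode                  */
    DWORD used;           /* valid bytes in pcm[]                       */
    DWORD consumed;       /* bytes already copied into the sound buffer */
} DecodeBuffer;

DWORD FillHalfBuffer(BYTE *half, DWORD halfSize, DecodeBuffer *db,
                     DWORD (*DecodeMoreWav)(BYTE *dst, DWORD dstMax))
{
    DWORD written = 0;
    while (written < halfSize) {
        if (db->consumed == db->used) {               /* nothing left over */
            db->used = DecodeMoreWav(db->pcm, sizeof(db->pcm));
            db->consumed = 0;
            if (db->used == 0) break;                 /* source exhausted  */
        }
        DWORD avail = db->used - db->consumed;
        DWORD fit   = halfSize - written;
        if (fit > avail) fit = avail;
        memcpy(half + written, db->pcm + db->consumed, fit);
        db->consumed += fit;    /* any overflow stays where it was decoded */
        written += fit;
    }
    return written;
}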

Now that all my MP3s play cleanly, I can get back to more important stuff.
Posted on 2009-11-30 00:03:41 by Homer
Attached is a Binary Executable which implements a Player supporting realtime streaming of WAV and MP3 files.
PLEASE TEST, try a few different mp3 files, and tell me whether you can hear any problems in the output!

I have a feeling that occasionally the SoundBuffer Notification mechanism breaks down, causing my 'onWantData' method to be called repeatedly as quickly as possible, rather than once per second (my playbuffer is two seconds long; I fill one half while the other half is being played).

The latest change was to add EventManager methods for unregistering events (by name and by handle), and to add code in StreamingSound.Done to unregister its own notification event, preventing a leak of one Event Object every time we Unload a StreamingSound (or StreamingMP3Sound).
Also added a CriticalSection (shared across the audio engine) which is used by the 'onWantData' event handler - this ensures that we cannot Unload a Sound while a notification event is being handled (which would cause a GPF).
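In C terms, the guard looks something like this (a sketch only - the handler and teardown names are placeholders, not the engine's actual API):

#include <windows.h>

typedef struct StreamingSound StreamingSound;
void RefillIdleHalfBuffer(StreamingSound *snd);      /* hypothetical refill   */
void ReleaseBuffersAndEvents(StreamingSound *snd);   /* hypothetical teardown */

CRITICAL_SECTION g_audioLock;         /* shared across the audio engine */

void onWantData(StreamingSound *snd)  /* notification event handler     */
{
    EnterCriticalSection(&g_audioLock);
    RefillIdleHalfBuffer(snd);
    LeaveCriticalSection(&g_audioLock);
}

void UnloadSound(StreamingSound *snd)
{
    EnterCriticalSection(&g_audioLock);  /* blocks until any handler is done */
    ReleaseBuffersAndEvents(snd);
    LeaveCriticalSection(&g_audioLock);
}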

NOTE: The attached binary does *NOT* contain these recent additions, so it's possible to trigger a GPF by trying to play a subsequent soundfile, or simply by trying to close the application, while a soundfile is currently being played.

Attachments:
Posted on 2009-11-30 04:00:05 by Homer
Found and fixed a problem introduced somehow into WAV Streaming.
Located a source code example of OGG Vorbis decoding with respect to DirectSound.
Also located a binary DLL providing ogg encoding/decoding functionality.
I would much prefer to obtain the LIB and link it into my audio engine - anyone care to build the most recent version and make the LIB available?

EDIT: Found a vorbis ACM codec YAY
Posted on 2009-11-30 08:14:30 by Homer
Spoke too soon - found and fixed another WAV streaming bug. Here's another test build: expect problems at the start or the end of an mp3 track, wav files should be fine, and the file size limit is currently 32 bits.

Please test this build for stability, and this time please give me some feedback, even if it's just a "yeah it works".

I've begun to support ARBITRARY ACM CODECS !!! Over 100 possible codecs :D I won't be able to provide arbitrary file format support, but I can at least open the door to all these codecs for transcode purposes!  8)
Attachments:
Posted on 2009-12-01 04:24:27 by Homer
The attached file is Dr Watson's crash log.
How to reproduce: just keep clicking the "play a local audio file" button and opening random mp3's. Eventually, after about the 10th opened mp3, the app will crash.
This happens both on limited and admin accounts.

One comment, if I may: DirectSound is deprecated on Vista and dropped on 7. MS says that developers should use XAudio2/XACT (though I personally recommend OpenAL ^^).

And one question, if I may: could you please elaborate on MSACM? It's very poorly documented. I remember that I tried using it several years ago for decompression but managed to make it work with only a few audio files (most probably I was doing something incorrectly). So a thorough explanation/tutorial would be appreciated.
Attachments:
Posted on 2009-12-01 13:41:06 by ti_mo_n
Thanks for the feedback!
Please read the notes in the previous post regarding this GPF - it's a known problem and has already been resolved.

Coincidentally, I spent the entire day today searching for information on XACT/XAUDIO2... and here I will present my findings.

With respect to XP, it's like this:

XACT->XAudio2->DirectSound->WaveIn/WaveOut

And under Vista/Win7, it's like this:

XACT->XAudio2->SOFTWARE MUX->Windows Core Audio->WaveOut
AFAIK DirectSound is available on Win7, but DirectSound HARDWARE ACCELERATION is not... read on!

First thing I need to say is that XAudio2 DOES NOT SUPPORT CAPTURE - Period.
Second thing is that, at least under XP, this is JUST a wrapper for DirectSound - in fact, all XAudio2 represents is a software mixer that spits wav data out to a single DirectSound buffer.
Third, Vista uses a 100% SOFTWARE audio stack which supports EMULATION of DirectSound - software on software, with no chance of hardware acceleration.
This fixes problems caused by bad drivers from OEM audio card manufacturers, but effectively means no hardware acceleration is possible... DirectSound will still work on Vista, but suffers the double indignity of a software-on-software base layer.

So anyway, I tried playing around with Microsoft's XACT demos on XP to see if they suffer the same DirectSound problems I am currently coping with (which I will elaborate on shortly) - and they don't; everything works nicely, despite the fact that, on XP, DirectSound is still the base.
So I looked closer at the XAudio2 API and I quickly realized how they did it.

The problem I'm having, which is a fairly well-known one, is related to the Audio Event Notification interface, which I use to track the current position of DirectSound IO buffers - it turns out that, even if a soundbuffer is Stopped, it can generate audio event notifications CAUSED BY SOUNDS BEING PLAYED IN ANOTHER PROCESS !!!
For me this is terrible - I have a two-second soundbuffer, with two notification keys at the halfway point and the end of the buffer - when the play cursor reaches a "halfbuffer" boundary, I fill the OTHER halfbuffer with more WAV data - so if I get untimely notifications generated by some other process, my audio goes out of sync and glitches like hell... fire up two instances of my player to see what I mean.
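For reference, this is roughly how those two notification keys get registered in the first place (a generic C sketch, not my engine code - the buffer must have been created with DSBCAPS_CTRLPOSITIONNOTIFY and must be stopped when the positions are set):

#include <windows.h>
#include <dsound.h>

HRESULT SetHalfAndEndNotifications(IDirectSoundBuffer *buf,
                                   DWORD bufferSize, HANDLE hEvent)
{
    IDirectSoundNotify *notify = NULL;
    HRESULT hr = IDirectSoundBuffer_QueryInterface(
        buf, &IID_IDirectSoundNotify, (void **)&notify);
    if (FAILED(hr)) return hr;

    DSBPOSITIONNOTIFY keys[2];
    keys[0].dwOffset     = bufferSize / 2 - 1;   /* halfway point */
    keys[0].hEventNotify = hEvent;
    keys[1].dwOffset     = bufferSize - 1;       /* end of buffer */
    keys[1].hEventNotify = hEvent;

    hr = IDirectSoundNotify_SetNotificationPositions(notify, 2, keys);
    IDirectSoundNotify_Release(notify);
    return hr;
}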
The Notification scheme is simply too broken (under most audio drivers), which was probably the reason they deprecated DirectSound... it was a case of "we can't trust these audio guys to get it right, so we'll do everything in software and then palm it off to a vanilla audio driver". Say goodbye to Environmental Audio Effects!

Microsoft MVPs propose that the solution is to not use DS notifications at all - instead, to provide a thread which POLLS the playbuffer, querying its position every 10ms or so - fine if there's only one soundbuffer, bad if there's a few... the idea is to periodically write *SOME* data to the soundbuffer, and my understanding is that even this solution has inherent problems... so I am led to believe, anyway.

Turns out that the XAudio2 API has a method called something like "DoWork" which we should call FREQUENTLY - I believe this method is doing that polling I mentioned, as well as generating soft notifications (rather than letting the audio driver do it).
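A skeleton of that kind of pump might look like this in C (placeholder names only - just the polling loop, not the engine's actual code):

#define WIN32_LEAN_AND_MEAN
#include <windows.h>

typedef struct StreamingSound StreamingSound;
void Advance(StreamingSound *snd);       /* fills the buffer behind the play cursor */

extern StreamingSound *g_playing[32];    /* currently playing streams */
extern volatile LONG   g_playingCount;
extern volatile LONG   g_shutdown;

DWORD WINAPI AudioPumpThread(LPVOID unused)
{
    (void)unused;
    while (!g_shutdown) {
        for (LONG i = 0; i < g_playingCount; ++i)
            Advance(g_playing[i]);
        Sleep(1000);                     /* 2-second buffers leave plenty of slack */
    }
    return 0;
}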

I think I can do something similar over DirectSound which will:

A- solve my current problems wrt notifications
B- work on XP, Vista and Win7
C- allow CAPTURE of audio under a common audio framework
D- allow ACM-based streaming under any installed codecs (Microsoft chose not to provide mp3 streaming examples in the DirectX SDK despite the fact that all these operating systems ship with a suitable codec - why do you think that is?)

I cannot believe that the DirectSound emulation layer on Vista is THAT slow - I say it's the software mixer layer, which we're stuck with regardless. I think Microsoft is PUSHING developers onto XAudio2 with a scare campaign. We all know how much Microsoft wishes to dominate the game development market, as seen with their push toward developer-oriented products such as XNA. Maybe they should stick to their day jobs.

So - I will continue working under DirectSound for the immediate future!

I've actually written about ACM in a previous post on this board - what would you like to know about it? Basically, the Audio Compression Manager API gives you access to every single audio codec installed on your system, and lets you use them to compress/decompress 'frames' of audio data; if necessary, you can CHAIN codecs together (we can use intermediate codecs in cases where a codec can't directly produce what we want).
It's VERY easy to use, especially if you want to convert an entire file at once rather than streaming frames of data... although there are a couple of tricks worth knowing. The main one is about determining the size of your two buffers. Assuming you want to decompress MP3, you can query the ACM for a suitable buffer size by nominating the size of the other buffer - say, "how big should my wav buffer be if the mp3 buffer is 2k in size?". The trick is that once you get the answer, you should query AGAIN for the OTHER buffer size, this time nominating the size it just gave you - "ok, so if I have an 8k WAV buffer, how big should the MP3 buffer be?". The value it returns will usually differ from the value you originally handed in - it might tell you "use an mp3 buffer of 2304 bytes". Now we have a matched set of buffer sizes calculated AROUND the original value, and we're ready to rumble. 99% of the example source code you will find DOES NOT do this - it uses magical predetermined values such as "MP3_BLOCK_SIZE" which work for a SPECIFIC bitrate but won't work in all cases... it seems everyone copied the same broken examples.
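In C, the two-step query looks roughly like this (error handling trimmed; wfxMp3 and wfxPcm are assumed to be the already-filled-in source and destination formats):

#include <windows.h>
#include <mmreg.h>
#include <msacm.h>

void MatchBufferSizes(LPWAVEFORMATEX wfxMp3, LPWAVEFORMATEX wfxPcm)
{
    HACMSTREAM has = NULL;
    DWORD mp3Bytes = 2048, wavBytes = 0;

    acmStreamOpen(&has, NULL, wfxMp3, wfxPcm, NULL, 0, 0,
                  ACM_STREAMOPENF_NONREALTIME);

    /* "how big should my wav buffer be if the mp3 buffer is 2k?" */
    acmStreamSize(has, mp3Bytes, &wavBytes, ACM_STREAMSIZEF_SOURCE);

    /* ...then ask the reverse question with the answer it just gave */
    acmStreamSize(has, wavBytes, &mp3Bytes, ACM_STREAMSIZEF_DESTINATION);

    /* mp3Bytes is usually no longer 2048 (eg 2304) - the pair
       (mp3Bytes, wavBytes) is now a matched set for this stream */

    acmStreamClose(has, 0);
}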

Anyway I'll spend a day or two on design considerations before I go ahead, and I'll post another demo very soon.
Posted on 2009-12-04 07:52:05 by Homer
With OpenAL you can get hardware-accelerated audio under XP, Vista and Win7. It completely bypasses the whole XACT/XAudio2/DirectSound stack and is a solution which talks directly to the audio driver.
I think it's worth checking out before you get in too deep with another solution.

As for mp3s, buffersizes and bitrates... Don't forget that there are also variable bitrate files, so at some point in the file the bitrate might go up, requiring larger buffers.
When variable bitrate mp3s first started appearing, cleanly designed players could handle them with ease, while hardcoded stuff would break :)
Posted on 2009-12-04 08:24:13 by Scali
My existing code does examine each MP3 frame header to detect Variable Bitrate, but I'm yet to find any MP3s in my collection that use it.
AFAIK ACM is VBR-safe: the sizes it calculates are maximums for a given frame size at the maximum bitrate... after decompressing a frame, you still need to check how much PCM/WAV data was actually emitted... and even with a FIXED bitrate, this size WILL vary from one frame to another, so we cannot use any constants.
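That check is just a matter of reading the stream header back after the conversion - a C sketch, assuming 'has' is an open ACM stream and 'ash' is a prepared ACMSTREAMHEADER whose source buffer already holds the next frame:

#include <windows.h>
#include <mmreg.h>
#include <msacm.h>

DWORD DecodeOneBlock(HACMSTREAM has, ACMSTREAMHEADER *ash)
{
    if (acmStreamConvert(has, ash, ACM_STREAMCONVERTF_BLOCKALIGN) != MMSYSERR_NOERROR)
        return 0;
    /* cbDstLengthUsed varies from frame to frame, even at a fixed bitrate */
    return ash->cbDstLengthUsed;
}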

My understanding is that Creative Labs is close to collapsing, and that OpenAL is effectively a dead duck waiting to be roasted. I won't be jumping on that bandwagon until I am convinced otherwise, although I will be watching closely! There's also something that redirects DirectSound calls to OpenAL and then figures out what to do with them based on the local machine - I forget what that's called - but if I do eventually choose OpenAL, it's likely that I'll do it that way.
Posted on 2009-12-04 08:35:31 by Homer
Well, most mp3 encoders can encode VBR, so it's easy to create a test file.
I can send you a VBR mp3 if you want.

As for Creative collapsing, I've not heard anything of the sort. And even if Creative were to collapse, I'm not sure if that would mean the end of OpenAL. There's quite a few games that use it. Developers will probably want continued support of OpenAL one way or another.
It wouldn't be the first time either... OpenGL was originally developed by SGI, but it was opened up, and eventually control was transferred to the Khronos group. With OpenCL, Apple moved to Khronos directly, merely providing them with a draft of the standard.

The DirectSound-to-OpenAL redirection (ALchemy) is one of Creative's driver features. They needed it to continue supporting hardware-accelerated DSound and their custom 3D effects and post-processing (EAX) that is supposed to be one of the main selling points of their soundcards over onboard audio.
Posted on 2009-12-04 08:53:05 by Scali
Sure, I'd love to try a VBR mp3 file and make sure my code handles it as expected - if you create one, can you please make it an extreme example? Say, an 8k stream interleaved with a 192k stream or something... and make sure that the first few frames use the lower bitrate, of course.
Posted on 2009-12-04 09:06:18 by Homer
"...the trick is that once you get the answer, you should query AGAIN for the OTHER buffersize, this time nominating the buffersize it told you in the first query..."

I haven't checked it yet, but I think this is exactly what I was doing wrong. This would explain why my code worked with some files and didn't with others. Thank you ^^

OpenAL has now been ported to consoles, so I don't think it's going to die even if we assume Creative's collapse (which I haven't heard anything about). And OpenAL is easier to use (you can get pretty much any audio effect you want by using something like 10 functions). The attached files are my test app, which enumerates OpenAL drivers (I can attach the source if anyone wants), and a screenshot showing how it works on a Sound Blaster X-Fi (for those who don't want to download and run an exe). There was an update to OpenAL this summer (some bugfixes and enhanced driver enumeration), so it looks like it's really being developed (i.e. the project is not dead).



I can send you a 6MB mp3 with bitrate jumping from less than 100 kbps to more than 500 kbps.
Posted on 2009-12-04 09:07:08 by ti_mo_n
500k!! lol !!! sure, send it! :D

Here is my preliminary work from this morning: my version of the "XAudio2::DoWork" method... since my soundbuffers are always a healthy 2 seconds in size, I should be able to call this around once per second, with plenty of room for lag caused by thread scheduling etc - not once every 10ms like Microsoft's model requires. Note that this method is implemented at the Buffer abstraction level; I'll need to call it for all sounds currently being played.

NextReadOffset is the offset into the source data - in this example, a large filemapped wav.
NextWriteOffset is an offset into the SoundBuffer where I will be writing my data (initially zero).
Worth noting that since I use the Debug version of DirectX, I have to ensure my SoundBuffer is initialized with silence before I start it playing.

By the way - your application works fine, and enumerates EAX support. Does this imply that I have OpenAL binaries installed without my knowledge?


;Method:    StreamingSound.Advance
;Purpose:   Fill the SoundBuffer *BEHIND* the PlayCursor
;Remarks:   Must be called about once per second to maintain audio quality
Method StreamingSound.Advance,uses esi
LOCAL playpos
   SetObject esi
   
   ;If we run out of data, stop playing
   mov eax,.dNextReadOffset
   .if eax>=.dAudioSize
     DbgWarning "Sound Data Exhausted - Stopping"
     OCall esi.Stop
     ExitMethod
   .endif

   ;Find out where the PlayCursor is within the Buffer
   ICall .pDSBuffer::IDirectSoundBuffer.GetCurrentPosition,addr playpos,NULL
   .if eax==S_OK
       ;Calculate how much data we can write
       ;(after our last write offset, and before the playcursor)
       mov eax,playpos
       .if .dNextWriteOffset < eax
           sub eax,.dNextWriteOffset            
       .else
           mov edx,.dNextWriteOffset
           sub edx,eax
           mov eax,.dBufferSize
           sub eax,edx
        .endif
        ;Clamp the write length so we never read past the end of the source data
        mov edx,.dAudioSize
        sub edx,.dNextReadOffset
        .if eax>edx
            mov eax,edx
        .endif
        ;Write as much data as possible to the Buffer
       push eax
       mov edx,.pStreamingData
       add edx,.dNextReadOffset
       OCall esi.WriteBuffer, edx, eax, .dNextWriteOffset
       pop eax
       add .dNextWriteOffset,eax
       add .dNextReadOffset,eax
   .endif

   ;Wrap the Write Cursor when it reaches (or passes) the end of the Buffer
   mov eax,.dBufferSize
   .if eax<=.dNextWriteOffset
       sub .dNextWriteOffset,eax
   .endif

MethodEnd


Posted on 2009-12-04 20:44:27 by Homer
Good news - I just tested this code, hammering the Advance method from my Windows MessagePump loop, and it works perfectly - even when another process generates sounds... I'll experiment with calling it once per second from another thread, and if all goes well, I'll write the StreamingMP3Sound version and set up a thread to drive all currently playing sounds, rather than polluting my application framework.

Two thumbs up to me, so far I think.
:thumbsup: :thumbsup:


EDIT:
Works fine as long as you keep generating windows messages by moving the mouse or whatever - driving it from the WM pump under GetMessage is not gonna cut it, but a PeekMessage-based pump, or a thread with a timed loop, will be fine.
Attached is a DEBUG build using the new method for streaming WAV files - feed it a big fat wav and wiggle your mouse while it's playing, should be cool, noting that we are calling the Advance method WAY more often than we need to.
I use DebugCenter to catch the debug strings from my applications; you'll need to install that if you wanna test this.
Go to the ObjAsm website, download the standalone DebugCenter executable, put it somewhere permanent, and run it once to register its filepath - then you'll be able to see my debug strings. Otherwise, put up with a nag message and it should still work regardless.

Attachments:
Posted on 2009-12-04 22:39:30 by Homer
It will probably crash at the end of the data - the issue is resolved here, but not in the posted demo.
Would really like some Vista and Win7 results !!
Posted on 2009-12-04 23:21:20 by Homer
More good news - implemented the MP3-streaming version, which works perfectly!! I will probably move the stream decompression code into its own method, so we can effectively stream using any codec we have installed on the system.


Posted on 2009-12-05 21:06:27 by Homer

"Attached is a DEBUG build using the new method for streaming WAV files - feed it a big fat wav and wiggle your mouse while it's playing..."

omg, I jumped off my chair when I tried this with an mp3 file - it was just loud static.
Previous versions have worked ok and produced sound that you can listen to (though it wasn't playing the mp3 file exactly how it should have), but using the same mp3 in this version gave just loud static. I'm using Win XP.
Posted on 2009-12-05 21:30:23 by Azura
Yes, that version was testing a new method of filling the buffer (a workaround for a bug in DirectSound) - as mentioned, I wanted to get WAV streaming working under that model before I went ahead with MP3, which I've now done! (Did you notice you can change the FILETYPE?) This workaround should correct all problems in MP3/WAV streamed playback.

Attached is a new demo - this one has its own dedicated thread for updating all currently playing sound streams. Since it only has to advance each stream once per second, it seemed unreasonable to use one thread per stream, so I'm using a single thread to manage all the currently playing (streaming) sounds at once... ok, this demo only lets you play one at a time, but the engine does support any number of simultaneous sound streams, in 2D and 3D.

You may notice a 2-second delay before you can hear anything - this is due to me not priming the soundbuffer with data before playback commences, and it can be eliminated...

NOTE:
In fact, the previous version should have played SILENCE when you loaded an MP3 - the fact that it didn't is a behavior caused by using the DirectX runtimes in Debug mode, or by explicitly binding to debug libraries (which I didn't). Either way, it won't happen anymore :)



Attachments:
Posted on 2009-12-05 21:34:54 by Homer
Interesting times - Biterider found a few files that won't play.

One of them is a WAV file which contains MP3-compressed data.
I didn't even know WAV files could do that!

The other two are small MP3 files which generate an error when I try to decompress them.
I gather something's weird in those files and they will require a closer examination.

In regards to compressed wav files, I think I'll be able to twist things so that I can play those, but it might require that I move my 'acm decompression stream handle' from the main engine into each compressed sound object (as I did with the 'acm decompression stream header') - i.e., I'm pretty sure that the decompression stream will need to be initialized using the exact same values as parsed from the extended wave format header.

Currently I decompress all mp3 files using the same shared decompression stream object, using a fixed set of values for 'preferred format of wave playback'...

So, it might not be a simple thing, but it kinda irks me that I can't handle these files.
Can anyone provide me with a WAV file containing data compressed by another codec?
I would love to try to write a unified solution that sets up the codec based on nothing more than the tag, although that may not be practical.

Anyway, to handle mp3-compressed WAV files, I need to do the following from my WAV loader:
A - Detect that the WAV is indeed MP3 compressed (completed)
B - Initialize a StreamingMP3Sound container, handing it the size of the compressed data (size obtained from extended wav header)
C - Possibly create a special ACM Decompression Stream object for this special stream
D - Return the MP3 container INSTEAD OF the StreamingSound object I'd normally be returning

The application shouldn't care whether the returned object is a compressed stream, an uncompressed stream or just a static uncompressed sound - I'm using OOP to redefine (override) the appropriate methods, so calling StreamingSound.Advance upon a StreamingMP3Sound object will in fact result in a call to StreamingMP3Sound.Advance :)

If I leave out Step C (for now, just to speed up development), I should be able to implement code to play these "mp3 compressed wave files" without having to abstract them away under a new class definition - just get my WAV loader to go "ooh, that's not raw pcm wav data, and I can handle mp3 streams, so treat the WAV data as mp3 data, please!"
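The detection step (A) really just comes down to looking at the format tag in the WAV file's 'fmt ' chunk - a C sketch, assuming the chunk has already been parsed into a WAVEFORMATEX:

#include <windows.h>
#include <mmreg.h>

int IsMp3CompressedWav(const WAVEFORMATEX *wfx)
{
    /* WAVE_FORMAT_PCM (0x0001) = raw wav data we can stream directly;
       WAVE_FORMAT_MPEGLAYER3 (0x0055) = hand the data to the MP3 path */
    return wfx->wFormatTag == WAVE_FORMAT_MPEGLAYER3;
}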

Note - eliminated the two second delay for streaming wav/mp3


Posted on 2009-12-06 01:56:26 by Homer
Figured out why those MP3s refused to play - they use a nonstandard (not 44.1k) sample rate.
Most codecs are not great at sample rate conversions.
The ACM can do it, but only by using an intermediate conversion stream, e.g.

22k MP3 -> 22k WAV -> 44.1k WAV

The alternative is to use a separate conversion stream for each concurrent sound stream.
E.g., set the playback sample rate in each Secondary Buffer to whatever the source sample rate is, and let DirectSound deal with the sample rate conversion when it mixes them into the primary buffer.
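A rough C sketch of that second option (pDS is an initialised IDirectSound8, error handling trimmed; the 16-bit stereo format here is just an example):

#include <windows.h>
#include <dsound.h>

HRESULT CreateBufferAtSourceRate(IDirectSound8 *pDS, DWORD srcRate,
                                 DWORD bytes, IDirectSoundBuffer **ppBuf)
{
    WAVEFORMATEX wfx = {0};
    wfx.wFormatTag      = WAVE_FORMAT_PCM;
    wfx.nChannels       = 2;
    wfx.nSamplesPerSec  = srcRate;              /* e.g. 22050 for a 22k mp3 */
    wfx.wBitsPerSample  = 16;
    wfx.nBlockAlign     = (WORD)(wfx.nChannels * wfx.wBitsPerSample / 8);
    wfx.nAvgBytesPerSec = wfx.nSamplesPerSec * wfx.nBlockAlign;

    DSBUFFERDESC desc = {0};
    desc.dwSize        = sizeof(desc);
    desc.dwFlags       = DSBCAPS_GETCURRENTPOSITION2 | DSBCAPS_GLOBALFOCUS;
    desc.dwBufferBytes = bytes;                 /* e.g. two seconds at srcRate */
    desc.lpwfxFormat   = &wfx;

    return IDirectSound8_CreateSoundBuffer(pDS, &desc, ppBuf, NULL);
}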

So it looks like I'm going to need to move that conversion-stream object (currently shared across all sounds) into each sound, as I suggested in the previous post.

Posted on 2009-12-06 04:13:37 by Homer
Today is officially a no-coding day - I need one now and then.
Perhaps tomorrow I'll start looking at dealing with the outstanding issues; there aren't many now, and soon this audio engine will be out of beta and the update published to the OA32 library.
Posted on 2009-12-06 21:37:31 by Homer