After searching this board as well as google in general, i didnt find anything for doing FFT and IFFT with MASM. So i decided to transcribe a source i had from Numerical Recipies in C. Here is the transcription, with a test program.

There is a bug in it tho, and its driving me crazy trying to find it. I hoping someone can give it a 'fresh eye' and hopefully spot my mistake. Its well commented, and i have the origional source above in the file (commented). I transcribed it in-file with a split views on (made it quite easy to do).

I have also provided a test program. It uses a series of data points (found in fpdata.inc) as its input. It doesnt crash (so its a nasty bug) and it produces normal numbers, only they are wrong.

I have also appended in fpdata.inc the correct answers as per MatLab's FFT.

Thanks for your help!
As well I plan to post this for optomization by all you guru's who ache for something new... if you do get it figured out, feel free to optomize it if you like. My goal is to hopefully make it another addition to the MASM32 LIB.

Here is a plot of the correct output responce (real values only):
Posted on 2003-05-11 23:37:21 by NaN
And here is the file and source...

If you do figure it out, tack on you name at the top so credit can be given in the final release.

Thanks again!
:alright:
NaN
Posted on 2003-05-11 23:40:22 by NaN
If anyone knows what im doing wrong here.. i would love to hear your thoughts... I plotted its output, and its similar in layout, but values are not correct. As well there is 'ringing' developing in the math (numbers are oscillating) and become more unstable as it gets further down the data array. I believe its this ringing that is faultering the output. However i can't see where it would be comming from. My only guess is its round off errors in the fpu math? (Im not an expert at such things, but it thought it did it correctly)..

:confused: NaN :confused:
Posted on 2003-05-12 23:46:06 by NaN

If anyone knows what im doing wrong here.. i would love to hear your thoughts...

Hmm-- FFT as in "Fast Fouries Transform"? Something with wave forms (audio), right? I'll see if I can take a look at it, but have little (almost nonexistant) knowledge of Fourier Transforms.
(well, that's about it (my thoughts before rushign to school))
Posted on 2003-05-12 23:55:30 by scientica
NaN, post a graphed example of your incorrect results, too?
I don't have any experience with FFT, but the graph might give an idea what's wrong.

Are you remembering to finit? Are you setting up the FPU rounding mode?
Posted on 2003-05-13 02:05:35 by f0dder
NAN - I only just scanned your code , but I notice in the SWAP routine you seem to be duplicating the 1-base array assumption of the original code, yet your data is zero-based.

Thus, when "i" = 1 (for example), you're really referencing the second array element when the original source wants to reference the first element.

Suggest adjusting the index or basepointer
Posted on 2003-05-13 05:51:08 by TheoMcCloskey
Thanx i will look at this indexing your talking about. However, i thought C is 0 based indexing? Only when you define the length do you make it 1 based in size...

f0dder, here is the plot you asked for:
Posted on 2003-05-13 17:46:55 by NaN
Actually it is working, well kinda. I was miss-reading the data. The ringing is because i was taking in the phase as well as the amplitude as an amplitude data set...

Below is a plot of the same data with every other point taken (even points). However, its not aligned correctly and has different peak amplitudes, so maybe there is an index problem of some sort?? Or maybe its the algorithm from Numberical Recipies in C. Still open to suggestions/comments!
Posted on 2003-05-13 18:07:01 by NaN

However, i thought C is 0 based indexing? Only when you define the length do you make it 1 based in size...


You fell into the nasty trap of Numerical Recipes in C. I posted a similar msg before, but I'll repeat.

Do _NOT_ take the code in that book as a good example or a bug-free implementation. The code there is a quick translation of its older and better half, the original "Numerical Recipes" (in Fortran77). Anyone who is serious about numerical methods should not consider the book as a ready-to-use C library, but read it as a general (introductory) algorithm book.

As for a better example of FFT in C, google for "FFTW" or "djbfft".


djbfft has its own assembly code in AT&T syntax. That would be a better starting point than translating C code.



I forgot to explain why NRiC array is not a (generic or proper) C array. The authors don't seem to understand C very well, but they do understand Fortran. They seem to make the indexing consistent between their Fortran code and the new quick-and-dirty translation into C. So, they actually waste sizeof(float) or sizeof(double) bytes for each vector.
Posted on 2003-05-13 22:51:10 by Starless
Thanks for the confirmation. I have it VERY close to working. (it *is* an indexing issue). However, when i get the magnitudes to line up perfectly, the phase is 180 out! (but alligned properly over its freq bins)..

Tweeking indexes pulls one in, and another out! Soooo frustrating....
However, thanks for the sugestion! Kinda upset with the book at this point (for all the time spent in debugging this crap).

:NaN:
Posted on 2003-05-13 23:28:57 by NaN
Sorry to go off-topic here - but finit?
Do you have to finit?

I NEVER bother, and it's always there when debugging - I assumed that MASM put it in for me or something..
Posted on 2003-05-14 01:26:39 by Homer
Homer, you say the finit instruction is there even though you haven't issued it? That's creepy if you're only doing assembly programming.

Windows might or might not have set up stuff for you, but there's generally no guarantees on register state when your process starts, so I tend to use finit (and set up rounding modes etc) when I do float stuff from assembly programs (which isn't very often :)). It's a few instructions at the start of my programs, and it lets me sleep at night ^_^
Posted on 2003-05-14 01:52:06 by f0dder
Hmm, getting OT... I avoided posting OT msg when I first saw someone mentioned finit, but it is hard to resist second time. :)

Actually, finit is the responsibility of OS. (Let's not argue that in DOS days, we had to do that. After all, it is a Win32 programming forum.) Otherwise, you won't have a reliable task switching when several processes use FPU. Modern x86 OS issues finit at the start up time, FPU registers are managed by the OS. To see how to do that, see, for example, Intel manual vol. 3.

I have never traced into Windows itself to verify that it actually manages FPU registers. But, I can infer, from my experience, that it does. Otherwise, my computation program and my small clock would have never worked at the same time.
Posted on 2003-05-14 02:43:48 by Starless
I gatter no-one has really looked over the source. Its the second finit comment, and its the first instruction in the proceedure...

So to answer you questions, yes I had consided this requirement in its initial design. The problem is 100% in the indexing of the for loops, since i have already gotten preciecely accurate numbers under certain situaltions (tweeking =, >, <, signs to work with a zero based index).

Im quite frustrated tho, and considering starting over with another source if i cant figure out the proper looping index start and stop values.

:NaN:
Posted on 2003-05-14 12:50:07 by NaN
Sooooooooooooooo BITTER!


Took all week of tracing code and funky indexing with this FFT algo. And finaly discovered the what the bug was!!

There was a minor index problem. But when corrected, everything was mathematically aligned except the phase of the transform would always be inverse of what it should be (opposite sign, right value in the right freq bin).

I traced and traced and traced through the code and always came up short.

Then when reviewing another algo version (with is identical to this) i noticed that their value for TwoPie was -6.28318530717959.

MINE WAS: 6.28318530717959. (180 degrees out of phase! ArGh!)

It wasnt the code, it was the constant. The ONLY constant!

This is a pretty nasty typo from the book! I double checked the Numberical Recipies in C and it has the positive value listed.... grrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr!

Anywho.. on the bright side, the FFT and IFFT algo runs fine now!

:NaN:
Posted on 2003-05-14 20:37:20 by NaN
Good job NaN! - It is quite a shame (or is it 'sham') that such an insidious error in the source prevails through publication. I'm sure I don't speek alone when I say 'I feel your pain'...

Anyhow, I did get a chance to further review your original post and, as you say, other than the data array offset/indexing (minor: adjust basepoint EDX), your code is a technically correct conversion. But I was wondering if your still interested in developing this further as I have some suggestions for optimization.
Posted on 2003-05-15 05:29:22 by TheoMcCloskey
Thanx.

I have no clue when it comes to rhym or reason for optomizations. This is why i come here ;) . But if you can and know how then i urge you to give it your mark (and suitable credit to yourself!).

The working version is in This Thread

There may be a way of normailizing the data as its being written back to the array data and data points. Saving the overhead of one more full pass, scaling the data back.

A test algorigthm is simple:
.386

.model flat,stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
include \masm32\include\user32.inc
include \masm32\include\masm32.inc
include \masm32\include\debug.inc

includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\user32.lib
includelib \masm32\lib\masm32.lib
includelib \masm32\lib\debug.lib

include fpdata.inc

DBGWIN_DEBUG_ON = 1
DBGWIN_EXT_INFO = 0

start:
push ebx
push esi


; Data set selection made easy, just cut and past into
; the equate the data you want to process.
atext TEXTEQU <IMPULSE> ; IMPULSE FP_DATA FDATA2

; Do forward FFT of data
invoke FFT, addr atext, 64, 1, 1 ; 64 Complex points

PrintText "===[ Forward FFT - Magnitude ]==="
lea edx, atext
xor ebx, ebx
.while (ebx < 128) ; Total array = 128 data
PrintDouble QWORD PTR [edx+ebx*8]
add ebx, 2
.endw
PrintText "===[ Forward FFT - Phase ]==="
lea edx, atext
xor ebx, ebx
inc ebx
.while (ebx < 128)
PrintDouble QWORD PTR [edx+ebx*8]
add ebx, 2
.endw


; Do reverse FFT of the data
invoke FFT, addr atext, 64, -1,0 ; 64 Complex points

PrintText " "
PrintText "===[ Reverse FFT - Magnitude ]==="
lea edx, atext
xor ebx, ebx
.while (ebx < 128)
PrintDouble QWORD PTR [edx+ebx*8]
add ebx, 2
.endw
PrintText "===[ Reverse FFT - Phase ]==="
lea edx, atext
xor ebx, ebx
inc ebx
.while (ebx < 128)
PrintDouble QWORD PTR [edx+ebx*8]
add ebx, 2
.endw


pop esi
pop ebx
invoke ExitProcess, NULL
end start
end


When in your "fpdata.inc" file the data is stored as will be labelled by the "atext" equate contents. Here is two sample sets of data. FP_DATA is the above 64 point data set with 5 full oscillations withing it. IMPULSE is an impulse at time = 0, which in the frequency domain (FFT) will produce 0 phase shift anywhere, and a constant amount of energy in all frequencies (After all an impulse is the sum of all frequencies). Here is the data sets:
[size=9]FP_DATA LABEL BYTE  ; 5 full cycles real in 64 data points (complex)

dq 0.0, 0.0
dq 0.4783, 0.0
dq 0.8400, 0.0
dq 0.9972, 0.0
dq 0.9115, 0.0
dq 0.6038, 0.0
dq 0.1490, 0.0
dq -0.3420, 0.0
dq -0.7498, 0.0
dq -0.9749, 0.0
dq -0.9626, 0.0
dq -0.7159, 0.0
dq -0.2948, 0.0
dq 0.1981, 0.0
dq 0.6428, 0.0
dq 0.9309, 0.0
dq 0.9922, 0.0
dq 0.8119, 0.0
dq 0.4339, 0.0
dq -0.0498, 0.0
dq -0.5214, 0.0
dq -0.8660, 0.0
dq -0.9997, 0.0
dq -0.8899, 0.0
dq -0.5633, 0.0
dq -0.0996, 0.0
dq 0.3884, 0.0
dq 0.7818, 0.0
dq 0.9848, 0.0
dq 0.9479, 0.0
dq 0.6802, 0.0
dq 0.2468, 0.0
dq -0.2468, 0.0
dq -0.6802, 0.0
dq -0.9479, 0.0
dq -0.9848, 0.0
dq -0.7818, 0.0
dq -0.3884, 0.0
dq 0.0996, 0.0
dq 0.5633, 0.0
dq 0.8899, 0.0
dq 0.9997, 0.0
dq 0.8660, 0.0
dq 0.5214, 0.0
dq 0.0498, 0.0
dq -0.4339, 0.0
dq -0.8119, 0.0
dq -0.9922, 0.0
dq -0.9309, 0.0
dq -0.6428, 0.0
dq -0.1981, 0.0
dq 0.2948, 0.0
dq 0.7159, 0.0
dq 0.9626, 0.0
dq 0.9749, 0.0
dq 0.7498, 0.0
dq 0.3420, 0.0
dq -0.1490, 0.0
dq -0.6038, 0.0
dq -0.9115, 0.0
dq -0.9972, 0.0
dq -0.8400, 0.0
dq -0.4783, 0.0
dq -0.0000, 0.0


IMPULSE LABEL BYTE
dq 1.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0
dq 0.0, 0.0[/size]


Enjoy! Hope you can speed up the algorithm. The faster it is, the more DSP work you can do in real time! Thanks for you interest on this project! To keep you informed, i plan on posting more DSP work and tools, such us MASM coders can start to do really *cool* stuff ;)

Side Note: An impulse is like a Lightning strike.. Very fast and then finished. Its no wonder that your radio will crackle from lightning activity, no matter what frequency your on. This is because an impulse floods Every frequency with energy (as shown in the FFT ;) )

:alright:
NaN
Posted on 2003-05-15 09:02:52 by NaN