I have one array of values (real4) with at least 12 values, and I want to convert that array into another array of exactly 12 values, which represent the averages of each part of the original array.
A simple example: Imagine the first array has 36 values, I want to convert that array into an array of 12 values, where each values is the avarage of 3 values from the original array..
Problems arise when the first array isn't a multiple of 12. I still want accurate averages, so the data has to be interpolated.
I'm looking for a simple algorithm to do this. I can think of some myself but I was wondering if there's a good one around.

Thomas
Posted on 2002-01-16 15:14:21 by Thomas
A good metaphor would be the line draw algorithm.
Think of that algorithm, and you should see the simple
solution - if you have no idea what I'm talking about,
I'll have to add more when I'm at home - too busy at work. ;)
Posted on 2002-01-16 15:35:09 by bitRAKE
This appears to be a working algo:
``````; arrDest    Pointer to buffer of size lenDest*4.  This buffer will hold an
;            array of REAL4 values returned from proceedure.
; arrSrc     Pointer to buffer of size lenSrc*4.  This buffer holds lenSrc REAL4
;            values to be averaged by the proceedure.
;
NonUniformAverages PROC uses esi edi, arrDest:DWORD, arrSrc:DWORD, lenDest:DWORD, lenSrc:DWORD
mov esi,arrSrc
mov edi,arrDest

mov edx,lenDest
cmp edx,lenSrc
jge Error

xor eax,eax
div lenSrc

fild lenSrc
fild lenDest
fdiv		; number of source positions per destination position (Note: >1)

fldz ; init

test edx,edx
jne @F

jnc Main

; load next value and split it into parts
; the fractional part in EDX indicates how much of next
; value goes into this and next position of destination
mov lenSrc,edx ; we use this for temp storage
fild lenSrc

fdiv fpc(4294967296.0) ; 2^32

fld REAL4 PTR [esi]
fld st ; two copies
fmul st(0),st(2)
fsubp st(1),st(0)

fxch st(2)
fstp st(1)
fdiv st(0),st(2)
fstp REAL4 PTR [edi]
dec lenDest
jne @B
ret

Error:	jne @F
; this isn't exactly an error ;)
mov ecx,edx
rep movsd
or eax,-1
ret

@@:	xor eax,eax ; error checking isn't a bad thing
ret
NonUniformAverages ENDP``````
I'd write some test code and work through the algo a bit more thoroughly before claiming it worked...but it looks good. ;)
Posted on 2002-01-17 00:43:00 by bitRAKE
Thanks, looks good. I'll look at it later, don't have time right now.

Thomas
Posted on 2002-01-17 00:53:25 by Thomas
Well it works most of the time but it has some strange behaviour.. Sometimes it produces outputs above 1.0 while all inputs are 1.0 or less...
Look at the attachment below, the inputs are almost all at the maximum (black line), but the averages (blended color) seems to fluctuate quite a bit.
Posted on 2002-01-17 13:03:24 by Thomas
Here's some code I tried, it produces invalid output values sometimes though.. Probably because it reads invalid values somewhere but I didn't found it yet. The algorithm I figured out is described in the attachment, the code only calculates the extended area, and does not yet substract the overflow parts. It's also not optimized in any way, I will when it's working..

Thomas

``````
[size=9]
NonUniformAverages PROC uses esi ebx edi arrDest:DWORD, arrSrc:DWORD, lenDest:DWORD, lenSrc:DWORD
LOCAL leftRest:DWORD
LOCAL rightRest:DWORD
LOCAL temp:DWORD
finit
mov     esi,arrSrc
mov     edi,arrDest

fild    lenSrc
fidiv   lenDest         ;lenSrc/lenDest

fld1
fadd    st(0),st(0)     ;st(0) = 2, st(1)=lenSrc/lenDest

xor     ecx, ecx
.WHILE  ecx<lenDest

; calculate (lenSrc/lenDest) * ecx, rounded down.
mov     eax, lenSrc
mul     ecx
xor     edx, edx        ;not really necessary
div     lenDest
mov     ebx, eax
mov     leftRest, edx

; calculate (lenSrc/lenDest) * (ecx+1), rounded up.
mov     eax, lenSrc
inc     ecx
mul     ecx
dec     ecx
xor     edx, edx        ;not really necessary
div     lenDest
or      edx, edx
jz      @F
inc     eax
@@:
mov     rightRest, edx

cmp     eax, ebx
je      @lr_eq

fldz
mov     edx, ebx
inc     edx
.WHILE  edx<=eax
inc     edx
.ENDW
fdiv    st(0),st(1)

jmp     @F
@lr_eq:
fld     real4 ptr [esi+4*eax]
@@:

.IF     leftRest
;calculate and substract left overflow
;yet to do
.ENDIF

.IF     rightRest
;calculate and substract right overflow
;yet to do
.ENDIF

;st(0)=area
;st(1)=2
;st(2)=lenSrc/lenDest

fdiv    st(0),st(2)
fstp    real4 ptr [edi+4*ecx]   ; area/width = average
inc     ecx
.ENDW

fstp    st(0)       ;kill 2
fstp    st(0)       ;kill lenSrc/lenDest

ret
NonUniformAverages endp
[/SIZE]
``````
Posted on 2002-01-17 13:27:16 by Thomas
I do the same algo, but since were are dealing with an array, we know that the values substracted at the end of the algo are added to the next partition. I didn't test with values <1, but will - it should work and be fast. ;)
Posted on 2002-01-17 14:15:19 by bitRAKE
I do the same algo, but since were are dealing with an array, we know that the values substracted at the end of the algo are added to the next partition

Clever, didn't think of that..
I don't know if the problems only occur with small values, but I need the algo for values between -1 and 1.
Here's an example that causes problems:
``````
inData  REAL4   0.0, -0.1788022, -0.7071068, -0.9645574
REAL4   -0.9862856, -1.0, -0.9964929, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0
REAL4   -0.9958844, -0.8571673, -0.5150281, 0.0
...
``````

I know the inputs are a bit weird but they are ripped from a memory dump and I don't know which values cause the problems.
All values are between -1 and 1 so the averages should lie between those values as well.
However, the 4th output value is -1.189916, the 8th is -1.702422, which are both out of range.
Could it be the presision of the algorithm? I tried to move the division by 2^32 so it would multiply first, then divide (more accurate) but didn't help. And as the FPU uses 80-bit registers for internal use it shouldn't be a problem.

Thomas
Posted on 2002-01-17 14:46:12 by Thomas
Presision shouldn't be a problem because the size of the arrays are so small - must be the algo - I threw it together really quickly. I didn't even work through the math - which I should have done. I just patched things together until the number output were what I expected. :tongue: Thanks for the values - I will surely try them when I get home.
Posted on 2002-01-17 14:57:19 by bitRAKE
Well, I found a few errors. I'm really surprised the presision is so bad - must still be an error some where? This algo seems right to me. I'll optimize it and see if I can find some bugs. Created serveral tests and they appear to work, but then the data above still produces values out of range? (but close) Let me know if you find something.
Posted on 2002-01-17 23:01:58 by bitRAKE
Just for clarification, here is a picture of what my algo is doing:

Escentially, EAX is the width of a source item, and when EDX overflows it means we've passed another destination width. At the time of overflow EDX holds the width of the second piece of the split source item. The split itself is when EDX=0 (red line: second picture).
Posted on 2002-01-19 00:30:29 by bitRAKE
Oh, and the working algo - almost forgot.
Still needs to be optimized. ;)
``````; arrDest    Pointer to buffer of size lenDest*4.  This buffer will hold an
;            array of REAL4 values returned from proceedure.
; arrSrc     Pointer to buffer of size lenSrc*4.  This buffer holds lenSrc REAL4
;            values to be averaged by the proceedure.
;
NonUniformAverages PROC uses esi edi, arrDest:DWORD, arrSrc:DWORD, lenDest:DWORD, lenSrc:DWORD
LOCAL fTemp:QWORD

mov esi,arrSrc
mov edi,arrDest

mov edx,lenDest
cmp edx,lenSrc
jge Error

xor eax,eax
div lenSrc

and DWORD PTR [fTemp+4],0
; 1/2^32
fld fpc(0.000000000232830643653869628906250)

fild lenDest
fidiv lenSrc

fldz ; init

Main:
jc @F

jmp Main
@@:

mov DWORD PTR [fTemp],edx
fild fTemp
fmul st(0),st(3)
fmul REAL4 PTR [esi]
fdiv st(0),st(2)
fxch st(1)

fsub st(0),st(1)
fmul st(0),st(2)
fstp REAL4 PTR [edi]

dec lenDest
jne Main

fstp st(0)
fstp st(0)
fstp st(0)
ret

Error:
jne @F
; this isn't exactly an error ;)
mov ecx,edx
rep movsd
or eax,-1
ret

@@:	xor eax,eax ; error checking isn't a bad thing
ret
NonUniformAverages ENDP

Posted on 2002-01-19 00:57:21 by bitRAKE
Thanks a lot! :alright: It works like a charm now.
All output values are correct too.

Thomas
Posted on 2002-01-19 03:29:10 by Thomas
Your welcome. The FPU isn't something I'm very familar with.

Are you using this algo to generate inputs for a neural net from more dynamic data?
Posted on 2002-01-19 04:34:51 by bitRAKE
Are you using this algo to generate inputs for a neural net from more dynamic data
.

Exactly, I now convert the angle to sine and cosine values. De black line is the sine value graph which I average to 12 values (the blended color) to get the inputs.
It works much better now, I was able to recognize eight different characters with a training set of each character drawn only once..

Thomas
Posted on 2002-01-19 04:52:35 by Thomas
Very impressive! I'm still working on the MMX versions. Initially, I'm just working to produce optimal output generators. Then I'll work on trying to train them. It's harder than I thought - trying to do plain MMX so everyone can use it.
Posted on 2002-01-19 05:02:26 by bitRAKE
I don't know which ranges the weights exactly are but because floating points are used they are very flexible (very small values like 1e-10 to very large values like 1e10). It's hard to do the same with integer values. You also need to calculate e^x on mmx regs... Seems very hard to me. Instruction sets like 3dnow are easier I think but you need to write them seperately for intel and amd :(. MMX is generally supported.
If you need help understanding the algorithm just ask, it took me quite some time to understand.

Thomas
Posted on 2002-01-19 05:08:07 by Thomas
Actually, I'm trying to use another method - wouldn't even bother with e^X in SMD integer...scary! ;) That function was only chosen because of it's properties - if I could do the same algorithmically with another function it should still work, or so I hope? It's all in fun and learning.
Posted on 2002-01-19 05:47:06 by bitRAKE
You can choose any function to do that, even a hardlimiting treshold function that only outputs 0 or 1, but the formulas in the back-propagation algo are derrivated from that formula, so you will need to derrivate them yourself (yuck :tongue: ).

Thomas
Posted on 2002-01-19 06:09:48 by Thomas
'Yuck' is a good word for it. I'm leaning towards a genetic approach combined with some back-propagation. When the algo is this fast learning can take 1000x longer. Okay, maybe it's more like a 1000000x times longer. ;) I really want some back-propagation as well.

Are you randomizing the inputs when you train your NN's?
Posted on 2002-01-19 06:21:26 by bitRAKE