hi all,

I was never very good at mathematics at school... :)
and I'm not strong on signed operations in assembler... :(

I need a little help.

I have to do the signed sum of two numbers; the first could be negative or positive, the second is positive...

Is there a better way than comparing values and using "cmp", "neg" instructions, jumps...?

thanks B7
Posted on 2003-08-23 15:50:51 by Bit7
Originally posted by Bit7
Is there a better way than comparing values and using "cmp", "neg" instructions, jumps...?

How about add?

Thomas
Posted on 2003-08-23 15:59:49 by Thomas
Hi Thomas,

what happens if we have a number like:

0111 1111 + 0100 0000 = 1011 1111 <---- negative???

peace
Posted on 2003-08-23 16:04:55 by mistronr1

Since a signed byte ranges from -128 to +127, your calculation causes an overflow (127 + 64 > 127). The same thing happens when you add, for example, 255 and 255 in unsigned 8-bit arithmetic. In both cases the result is invalid, but the flags are set (the overflow flag in the signed case, the carry flag in the unsigned case) so that you can detect this and take the right action.
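
For anyone who wants to see those two flags concretely, here is a minimal C sketch of an 8-bit add. It is not from the thread: `add8` and `add8_result` are made-up names, and the flag rules are just the usual ones written out by hand.

```c
#include <stdint.h>
#include <stdbool.h>

/* Result of an 8-bit add plus the two flags discussed above.
   (Hypothetical names, for illustration only.) */
typedef struct {
    uint8_t bits;     /* the 8 result bits, the same for signed and unsigned */
    bool    carry;    /* unsigned result does not fit in 0..255 */
    bool    overflow; /* signed result does not fit in -128..127 */
} add8_result;

static add8_result add8(uint8_t a, uint8_t b)
{
    add8_result r;
    unsigned wide = (unsigned)a + (unsigned)b;   /* exact sum, at most 510 */

    r.bits  = (uint8_t)wide;
    r.carry = wide > 0xFFu;
    /* Signed overflow: the result's sign bit differs from the sign bit
       of both operands. */
    r.overflow = ((a ^ r.bits) & (b ^ r.bits) & 0x80) != 0;
    return r;
}
```

With 0x7F + 0x40 this reports overflow but no carry; with 0xFF + 0xFF it reports carry but no overflow (as signed numbers that is -1 + -1 = -2, which fits).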

Thomas
P.S. signed numbers addiction and addition are two totally different things :grin:...
Posted on 2003-08-23 18:00:22 by Thomas
I'm having a hard time getting just what Bit7 wants, but has it been made clear that you can use .if/.endif to compare signed numbers?



mov eax, PositiveOrNegative
mov edx, PositiveOnly
.if( SDWORD PTR eax >= 0 )
    add eax, edx
    jo @PositiveAddOverflowError    ; signed overflow sets OF, not CF
.else
    add eax, edx
    jo @NegativeAddOverflowError    ; cannot trigger while edx is 0..7FFFFFFFh
.endif


Maybe im way off base here.. i dunno..
:alright:
NaN
Posted on 2003-08-23 19:31:09 by NaN
I would suggest the OR instruction to test whether the first number is negative or positive. If it is negative, the sign flag will be set. In that case, adding a positive number to a negative number can never cause an overflow. The sign of the result will depend on which one had the greater absolute value.

If both are positive, you then have to test the result for an overflow which would set the sign flag and indicate an invalid result.

The following example uses 8-bit registers but you can modify it for any size register.


mov al,firstnum
or al,al
jns @F ;jump if first number is positive
add al,secondnum
jmp finish
@@:
add al,secondnum
js OVERFLOW ;go handle the overflow error
finish:
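
The same reasoning can be sketched in C. This is only an illustration under the thread's assumption that the second number is in the signed positive range 0..127; `add_signed_pos` is a hypothetical name.

```c
#include <stdint.h>
#include <stdbool.h>

/* Adds a signed byte and a non-negative signed byte (0..127).
   Returns false when the true sum does not fit in -128..127.
   (Hypothetical helper, for illustration only.) */
static bool add_signed_pos(int8_t first, int8_t second_nonneg, int8_t *sum)
{
    int wide = (int)first + (int)second_nonneg;  /* exact, no wraparound */

    /* A negative first operand can never overflow here: the sum stays
       within -128..126. Only positive + positive needs the check. */
    if (first >= 0 && wide > INT8_MAX)
        return false;

    *sum = (int8_t)wide;
    return true;
}
```

For example, -10 + 100 succeeds with 90, while 127 + 64 is rejected.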


Raymond
Posted on 2003-08-23 21:56:50 by Raymond
Thanks all, and sorry for my bad explanation.

Yes, my problem was the addition of a positive or negative number with a positive number.

So it seems I have to introduce comparisons and jumps... I would like to avoid this because I have a big routine with about 16 of these operations, and I think that many "cmp/jump" pairs will make my routine run too slowly...

What about switching to floating point?
Posted on 2003-08-24 00:09:42 by Bit7
Bit7, maybe you're just getting confused by the fact that the same add and sub instructions are used for both signed and unsigned. In your example

0111 1111 + 0100 0000 = 1011 1111 <---- negative???

The answer is both negative and positive; it depends on which instructions you use on it from then on. If you use e.g. mul or jb, the number is treated as unsigned (positive), whereas idiv or jg would treat it as signed (negative). With add and sub it doesn't matter.

In floating point all numbers are signed, you can't get unsigned numbers.
Posted on 2003-08-24 06:48:19 by Eóin
I think Eóin is correct. It does not matter whether the number is positive or not; what really matters is whether you are adding unsigned numbers to signed numbers or something like that.

If both are signed, no problem, just use plain add. If one is signed and the other is unsigned, then things get harder.
Posted on 2003-08-24 07:02:21 by roticv
Bit7, there's an important bit you don't mention. When you say the second number is positive, do you mean positive in the signed range 0-127 or in the unsigned range 0-255?

If it's in the signed positive range, then don't worry about checking signs; just add and sub, watching out for overflows. If it's in the unsigned range, then it only makes sense to treat the answer as an unsigned number. E.g. -10 + 200 = 190 if you're taking unsigned numbers, or -66 if you're taking signed.
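
A small C sketch of that point, with made-up helper names (`add_bits`, `as_unsigned`, `as_signed`): the add produces one bit pattern, and the sign only appears when you choose how to read it.

```c
#include <stdint.h>

/* 8-bit add on raw bit patterns; wraps modulo 256 like the CPU's ADD. */
static uint8_t add_bits(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Read the 8 bits as an unsigned value, 0..255. */
static int as_unsigned(uint8_t bits)
{
    return bits;
}

/* Read the same 8 bits as a two's complement value, -128..127. */
static int as_signed(uint8_t bits)
{
    return bits < 128 ? bits : (int)bits - 256;
}
```

Here -10 is the pattern 0xF6, so `add_bits(0xF6, 200)` gives 0xBE, which reads as 190 unsigned and -66 signed. Likewise 0x7F + 0x40 gives 0xBF, which reads as 191 or -65.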

The lesson here is: don't mix signed and unsigned numbers too much. Positive and negative are OK to mix, just keep them within their signed ranges.
Posted on 2003-08-24 07:37:56 by Eóin
roticv, Eóin, thanks again,

I think I have understood that add and sub work just with numbers and don't care about the sign, so I have to check and manage it myself.

Since I've discovered I need just one more operation, where both numbers could be negative or positive... I decided to go for floating point to make things easier.





fild [y]
fiadd [cr]          ; can be positive or negative
fidiv [i256]
fiadd [RTable2]     ; can be positive or negative
fistp [offTable]
mov al, [byte ptr offTable]
mov [byte ptr edi+2], al




So I let the FPU do the homework for me :) but I think that using the FPU makes my routine slower... does it?
Posted on 2003-08-24 08:49:10 by Bit7
It will make it very much slower, especially that division instruction.

If you are actually dividing by 256, you could then use the very fast SHR instruction instead of the DIV with the CPU.

Raymond
Posted on 2003-08-24 09:12:04 by Raymond
All this guesswork. It would be a lot more efficient to help you if we knew roughly what your algorithm was meant to be.. (I can interpret some of it from your floating-point listing), but I'm not sure where the end result is....

:NaN:
Posted on 2003-08-24 10:22:28 by NaN
It's just a routine I'm converting from C.
The routine converts a YUV 4:1:1 image to RGB, 24 bits per pixel.


Since you're asking, this is the main loop code:



    while (wly--) {
        for (i=0; i<lx; i+=4) {
            cr=*crp++;
            cr-=128;
            cb=*cbp++;
            cb-=128;
            cg=cr;
            cg+=cb;
            cr*=409;
            cg*=-617;
            cb*=517;
            cg+=cr;
            cg+=cb;
            y=MulTable[*src++];
            *rp=Table[(y+cr)>>8];
            rp+=rinc;
            *gp=Table[(y+cg)>>8];
            gp+=ginc;
            *bp=Table[(y+cb)>>8];
            bp+=binc;
            y=MulTable[*src++];
            *rp=Table[(y+cr)>>8];
            rp+=rinc;
            *gp=Table[(y+cg)>>8];
            gp+=ginc;
            *bp=Table[(y+cb)>>8];
            bp+=binc;
            y=MulTable[*src++];
            *rp=Table[(y+cr)>>8];
            rp+=rinc;
            *gp=Table[(y+cg)>>8];
            gp+=ginc;
            *bp=Table[(y+cb)>>8];
            bp+=binc;
            y=MulTable[*src++];
            *rp=Table[(y+cr)>>8];
            rp+=rinc;
            *gp=Table[(y+cg)>>8];
            gp+=ginc;
            *bp=Table[(y+cb)>>8];
            bp+=binc;
        }
    }
}
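
To make the arithmetic in that loop easier to follow, here is a hedged single-pixel sketch in C. The excerpt does not show what `MulTable` and `Table` contain, so this assumes that `MulTable[v]` is `(v - 16) * 298 + 128` and that `Table` clamps to 0..255; the names `yuv_to_rgb`, `rgb8`, `sar8` and `clamp255` are made up for the example.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } rgb8;

/* Floor-divide by 256 (what an arithmetic shift by 8 does), written so it
   does not depend on how the compiler shifts negative values. */
static int sar8(int v)
{
    return v < 0 ? ~(~v >> 8) : v >> 8;
}

/* Assumed behaviour of Table[]: clamp to the 0..255 byte range. */
static uint8_t clamp255(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

static rgb8 yuv_to_rgb(uint8_t y_raw, uint8_t cr_raw, uint8_t cb_raw)
{
    int cr = cr_raw - 128;
    int cb = cb_raw - 128;
    int cg = cr + cb;

    cr *= 409;        /* about 1.596 * 256 */
    cg *= -617;
    cb *= 517;        /* about 2.018 * 256 */
    cg += cr + cb;    /* net: -208*(Cr-128) - 100*(Cb-128) */

    /* Assumed MulTable entry: 298 is about 1.164 * 256, +128 rounds the >>8. */
    int y = (y_raw - 16) * 298 + 128;

    rgb8 out;
    out.r = clamp255(sar8(y + cr));
    out.g = clamp255(sar8(y + cg));
    out.b = clamp255(sar8(y + cb));
    return out;
}
```

The -617 looks odd until you notice that -617 + 409 = -208 and -617 + 517 = -100, which are roughly -0.813 and -0.391 scaled by 256. Multiplying the sum (cr + cb) once and then adding back the two products that are needed anyway gives the green term with three multiplies in total instead of four.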



I've just converted it and it seems to work... but I have to find out whether it is faster or slower, heheheh

B7
Posted on 2003-08-24 11:00:29 by Bit7

If you are actually dividing by 256, you could then use the very fast SHR instruction instead of the DIV with the CPU.
Of course, SAR for signed numbers - the FPU only works with signed numbers.
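
A quick C sketch of why the choice of shift matters for negative values (hypothetical helper names; `sar8` is written portably because right-shifting a negative `int` is implementation-defined in C):

```c
#include <stdint.h>

/* Arithmetic shift right by 8, like SAR: floor(x / 256).
   For negative x, ~x is non-negative, so the shift is well defined. */
static int32_t sar8(int32_t x)
{
    return x < 0 ? ~(~x >> 8) : x >> 8;
}

/* Logical shift right by 8, like SHR: zeros are shifted in at the top,
   so a negative input turns into a huge positive number. */
static uint32_t shr8(int32_t x)
{
    return (uint32_t)x >> 8;
}
```

So `sar8(-256)` is -1 while `shr8(-256)` is 0x00FFFFFF. Note also that SAR rounds toward minus infinity, whereas C's `/` truncates toward zero: `sar8(-1)` is -1 but `-1 / 256` is 0. The FPU version with fistp rounds according to the current rounding mode (to nearest by default), so the three approaches can differ by one for some inputs.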
Posted on 2003-08-24 11:12:01 by bitRAKE
A good rule to remember when using the:

add
sub
cmp

instructions is to use these jumps when dealing with unsigned numbers only:

JB, JBE, JA, JAE these test the Carry flag (JBE and JA also test the Zero flag)

and use these jumps when dealing with signed numbers only:

JL, JLE, JG, JGE these test the Sign & Overflow flags (JLE and JG also test the Zero flag)
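
The rule is easy to see in C, where the type picks the comparison for you. A minimal sketch with made-up names, comparing the same two bit patterns both ways:

```c
#include <stdint.h>
#include <stdbool.h>

/* Read 8 bits as a two's complement value, -128..127. */
static int as_signed(uint8_t bits)
{
    return bits < 128 ? bits : (int)bits - 256;
}

/* Unsigned "below", the comparison JB acts on after a cmp. */
static bool below_unsigned(uint8_t a, uint8_t b)
{
    return a < b;
}

/* Signed "less", the comparison JL acts on after a cmp. */
static bool less_signed(uint8_t a, uint8_t b)
{
    return as_signed(a) < as_signed(b);
}
```

With a = 0xFF and b = 0x01, `below_unsigned` is false (255 is not below 1) but `less_signed` is true (-1 is less than 1). Picking the wrong family of jumps gives exactly this kind of flipped result.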

farrier
Posted on 2003-08-24 14:05:55 by farrier
The nice thing about two's complement arithmetic is that adding (and subtracting) numbers is exactly the same for signed and unsigned. That's why, unlike multiply and divide, there aren't separate signed/unsigned versions of ADD and SUB. As always, arithmetic overflow will invalidate your results (unless you want the "wraparound" feature.)
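
As a rough C illustration (hypothetical helper names), here is an 8-bit add next to the two flavours of widening 8-bit multiply:

```c
#include <stdint.h>

/* Read 8 bits as a two's complement value, -128..127. */
static int as_signed(uint8_t bits)
{
    return bits < 128 ? bits : (int)bits - 256;
}

/* One add serves both interpretations: the result bits are identical. */
static uint8_t add8_bits(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Widening multiply treating the inputs as unsigned, like MUL. */
static unsigned mul8_unsigned(uint8_t a, uint8_t b)
{
    return (unsigned)a * b;
}

/* Widening multiply treating the inputs as signed, like IMUL. */
static int mul8_signed(uint8_t a, uint8_t b)
{
    return as_signed(a) * as_signed(b);
}
```

For 0xFF times 0xFF the unsigned product is 65025 (255 * 255) and the signed product is 1 (-1 * -1). The low 8 bits agree (both end in 0x01); it is the upper half of the widened result that differs, which is why multiply needs two versions and add does not.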
Posted on 2003-08-25 16:29:43 by tenkey
Thanks all again, now I've understood what I was missing:

For some stupid reason (I have some big black holes in my knowledge of signed numbers), I thought that

11111111 = -128
10000001 = -1
:stupid:

Instead, reading the book Advanced Assembly Language .. I soon discovered that

11111111 = -1
10000000 = -128
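
One way to remember it: in two's complement the top bit simply has weight -128 instead of +128. A small C sketch (the function name is made up) that decodes an 8-character bit string on that basis:

```c
/* Decode an 8-character string of '0'/'1' as a two's complement byte.
   The leftmost bit weighs -128; the rest weigh 64, 32, ..., 1.
   (Hypothetical helper; assumes the string has exactly 8 digits.) */
static int twos_complement8(const char *bits)
{
    int value = 0;
    for (int i = 0; i < 8; i++) {
        int bit = (bits[i] == '1');
        int weight = (i == 0) ? -128 : (128 >> i);
        value += bit * weight;
    }
    return value;
}
```

So "11111111" is -128 + 127 = -1 and "10000000" is -128, while "10000001" is -127 rather than -1 (it would only be -1 in a sign-and-magnitude encoding).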

So I also soon discovered why Intel processors have imul and idiv but not iadd or isub :)
It would be worth some moderator moving my "posted" count to 0, or better, -1 :)

OK, now I've learned it the hard way and I hope I'll remember it well.
So on Sunday, when the guys in my town go to the beach, I decided to change my YUV 4:1:1 to RGB routine to avoid the slow floating-point instructions.
This is the latest release :)



;---------------------------------------------------------------------------
proc ImageToRGB uses ebx edx esi edi, src:dword, dst:dword, dstpitch:dword, lx:dword, ly:dword

;/* src : image in CM_YUV411P format
;/* dst : destination buffer
;/* dstpitch : dest. line pitch, pass zero to get the default
;/* lx,ly : image dimensions in pixels
;/* wy, wly : sub-image window to process (0,0 for all)

;B = 1.164(Y - 16) + 2.018(U - 128)
;G = 1.164(Y - 16) - 0.813(V - 128) - 0.391(U - 128)
;R = 1.164(Y - 16) + 1.596(V - 128)

dataseg

RoundTable db 300 dup(0), 256 dup(0), 300 dup(255)
MulTable dd 256 dup(0)
Inited dd 0

codeseg

LOCAL i, y, RTable2: dword
LOCAL cr, cg, cb : dword

cmp [Inited],0
jne @@st03

; Round-Table initialization
or [Inited],1
lea edi,[RoundTable]
add edi,300
xor ecx,ecx
@@st00: mov [byte ptr edi],cl
inc ecx
inc edi
cmp ecx,256
jl @@st00

lea edi,[MulTable]
xor ecx,ecx
@@st02: mov eax,ecx
sub eax,16
imul eax,298
add eax,128
mov [dword ptr edi],eax
add edi,4
inc ecx
cmp ecx,256
jl @@st02

; set up the chroma red and chroma blue pointers

@@st03: push offset RoundTable + 300
pop [RTable2]

mov esi,[src] ; esi = luminance pointer

mov eax,[lx]
imul eax,[ly]
push eax
mov ebx,eax
add ebx,[src] ; ebx = chroma red pointer

pop eax
shr eax,2
mov edx,ebx
add edx,eax ; edx = chroma blue pointer

mov edi,[dst] ; edi RGB blue ptr

; conversion loop

@@st04: xor ecx,ecx

@@st05: movzx eax,[byte ptr ebx]
mov [cr],eax
sub [cr],128

movzx eax,[byte ptr edx]
mov [cb],eax
sub [cb],128

push [cr]
pop [cg]

mov eax,[cb]
add [cg],eax

mov eax,[cr]
imul eax,409
mov [cr],eax

mov eax,[cg]
imul eax,-617
mov [cg],eax

mov eax,[cb]
imul eax,517
mov [cb],eax

mov eax,[cg]
add eax,[cr]
add eax,[cb]
mov [cg],eax

movzx eax,[byte ptr esi]
shl eax,2
push [dword ptr offset MulTable + eax]
pop [y]

mov eax,[y]
add eax,[cr]
sar eax,8
add eax,[RTable2]

mov al,[byte ptr eax]
mov [byte ptr edi],al

mov eax,[y]
add eax,[cg]
sar eax,8
add eax,[RTable2]

mov al,[byte ptr eax]
mov [byte ptr edi+1],al

mov eax,[y]
add eax,[cb]
sar eax,8
add eax,[RTable2]

mov al,[byte ptr eax]
mov [byte ptr edi+2],al

inc esi ; increment luminance ptr
add edi,3 ; increment RGB ptr

movzx eax,[byte ptr esi]
shl eax,2
push [dword ptr offset MulTable + eax]
pop [y]

mov eax,[y]
add eax,[cr]
sar eax,8
add eax,[RTable2]

mov al,[byte ptr eax]
mov [byte ptr edi],al

mov eax,[y]
add eax,[cg]
sar eax,8
add eax,[RTable2]

mov al,[byte ptr eax]
mov [byte ptr edi+1],al

mov eax,[y]
add eax,[cb]
sar eax,8
add eax,[RTable2]

mov al,[byte ptr eax]
mov [byte ptr edi+2],al

inc esi ; increment luminance ptr
add edi,3 ; increment RGB ptr

movzx eax,[byte ptr esi]
shl eax,2
push [dword ptr offset MulTable + eax]
pop [y]

mov eax,[y]
add eax,[cr]
sar eax,8
add eax,[RTable2]

mov al,[byte ptr eax]
mov [byte ptr edi],al

mov eax,[y]
add eax,[cg]
sar eax,8
add eax,[RTable2]

mov al,[byte ptr eax]
mov [byte ptr edi+1],al

mov eax,[y]
add eax,[cb]
sar eax,8
add eax,[RTable2]

mov al,[byte ptr eax]
mov [byte ptr edi+2],al

inc esi ; increment luminance ptr
add edi,3 ; increment RGB ptr

movzx eax,[byte ptr esi]
shl eax,2
push [dword ptr offset MulTable + eax]
pop [y]

mov eax,[y]
add eax,[cr]
sar eax,8
add eax,[RTable2]

mov al,[byte ptr eax]
mov [byte ptr edi],al

mov eax,[y]
add eax,[cg]
sar eax,8
add eax,[RTable2]

mov al,[byte ptr eax]
mov [byte ptr edi+1],al

mov eax,[y]
add eax,[cb]
sar eax,8
add eax,[RTable2]

mov al,[byte ptr eax]
mov [byte ptr edi+2],al

inc esi ; increment luminance ptr
inc ebx ; increment chroma red ptr
inc edx ; increment chroma blue ptr
add edi,3 ; increment RGB ptr

add ecx,4
cmp ecx,[lx]
jl @@st05

dec [ly]
cmp [ly],0
jg @@st04

ret

endp ImageToRGB
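
For readers who prefer C, the clamping trick behind RoundTable can be sketched like this. It is an illustration with made-up names (`round_table`, `init_round_table`, `clamp_lookup`), not a translation of the whole routine.

```c
#include <stdint.h>

enum { PAD = 300 };   /* guard entries on each side, as in RoundTable */

/* 300 zeros, then 0..255, then 300 entries of 255. */
static uint8_t round_table[PAD + 256 + PAD];

static void init_round_table(void)
{
    for (int i = 0; i < PAD + 256 + PAD; i++) {
        int v = i - PAD;
        round_table[i] = v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
    }
}

/* Clamp by lookup instead of by compare-and-jump.
   Only valid for -300 <= v <= 555; anything else reads out of bounds. */
static uint8_t clamp_lookup(int v)
{
    return round_table[v + PAD];
}
```

Call `init_round_table()` once before the first lookup. Indexing from the middle of the table (the `RoundTable + 300` pointer in the listing) lets negative results land in the zero padding and results above 255 land in the 255 padding, so no cmp/jump is needed per channel. With the constants used here (298, 409, 517, and a net -208/-100 for green) the shifted sums work out, by my arithmetic, to between -277 and 535, so 300 entries of padding on each side are enough.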



I will do some measurements; my goal is to make it faster than the VC++ one, and any suggestion/trick is appreciated.

Thanks all again, B7
Posted on 2003-08-26 16:45:18 by Bit7