You've done it very well.

It's optimal from both the math and the asm point of view (well paired).

Now try to do it in FPU.

Assume that

.data?

x real8 ?

y real8 ?

subOfXYsq real8 ?

BTW, I hope you use a debugger.

Nothing makes the case for a debugger as clearly as learning instructions and testing small pieces of code.

That in turn means you need a quick and easy way to load your code into the debugger (I do it with one keystroke in my shell).

Here is my try in FPU (like I said before, I'm not good at this):

```
fld x            ; st(0) = x
fadd y           ; st(0) = x+y
fld x            ; st(0) = x, st(1) = x+y
fsub y           ; st(0) = x-y, st(1) = x+y
fmul             ; st(0) = (x+y)*(x-y), stack popped once
fstp subOfXYsq   ; store the result and pop
```

eko:

Here is one more simple task (5th grade).

eax = side of square1

ecx = side of square2

We don't know which one is bigger.

Task:

Find the positive (absolute) difference of the perimeters of these two squares, without branching.

Give solutions for both FPU and integer.
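If you want something to check your integer asm against in the debugger, here is a minimal C sketch of one possible branchless approach. It is not necessarily the intended solution. The function name is made up for illustration, and it assumes the difference of the two sides is below 2^30, so that the sign bit of the wrapped difference is meaningful and the result fits in 32 bits.

```c
#include <stdint.h>

/* Hypothetical helper: |4*a - 4*b| with no conditional jump.
   Assumption: |a - b| < 2^30. */
uint32_t perim_diff_int(uint32_t a, uint32_t b)
{
    uint32_t d    = a - b;           /* wraps modulo 2^32 when a < b      */
    uint32_t mask = 0u - (d >> 31);  /* 0 if d >= 0, 0FFFFFFFFh if d < 0  */
    d = (d ^ mask) - mask;           /* two's-complement absolute value   */
    return d << 2;                   /* perimeter = 4 * side              */
}
```

The mask step plays the role a sign-extension instruction would play in asm; the xor/sub pair then negates only when the mask is all ones.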

Another task:

Find the alternating-sign sum of the hex digits in a dword.

Alternating-sign means that for the hex value 1234AFBCh

you need to find

1-2+3-4+A-F+B-C

It can be written as the difference between the sum of the digits in odd positions and the sum of the digits in even positions (counting from the left):

(1+3+A+B)-(2+4+F+C)
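To have a known-good answer to compare against while testing, a slow reference version can be sketched in C as below. The function name is hypothetical; it simply walks the eight digits from the most significant one and alternates the sign.

```c
#include <stdint.h>

/* Reference (slow) version, for checking results only:
   digits 1,3,5,7 from the left are added, digits 2,4,6,8 are subtracted. */
int alt_hex_sum_ref(uint32_t v)
{
    int sum = 0;
    for (int i = 0; i < 8; i++) {
        int digit = (int)((v >> (28 - 4 * i)) & 0xFu);
        sum += (i % 2 == 0) ? digit : -digit;
    }
    return sum;
}
```

For 1234AFBCh this gives (1+3+10+11)-(2+4+15+12) = 25-33 = -8.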

Get Anger.hlp with the instruction set, and use it to optimize for the best pairing.

Good Luck!

We don't know tricks - we just invent them ;)

I thought about the task above and have posted my first tries.

Please don't look if you wish to solve it yourself. ;)

I couldn't help myself :)

Good code.

1st is the same logic but one clock faster than mine.

I missed the obvious fact that -(-1) is the same as +1 :)

The problem with the second code, despite its good ideas, is dependencies.

It takes 11 clocks:

mov edx,eax ;1

and eax,0F0F0F0Fh ;0

shr edx,4 ;1

and edx,0F0F0F0Fh ;1

or edx,10101010h ;1

sub edx,eax ;1

mov eax,edx ;1

shr edx,16 ;1

add eax,edx ;1

add al,ah ;1

and eax,0FFh ;1

sub eax,64 ;1
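To see why the final `sub eax,64` is needed, the sequence above can be modelled step by step in C. This is only a sketch for following the arithmetic, with a made-up function name: the `or` adds 16 to each of the four bytes so the byte-wise subtraction never borrows from a neighbour, and 4 * 16 = 64 is removed at the end.

```c
#include <stdint.h>

/* C model of the 11-clock sequence (function name is hypothetical). */
int alt_hex_sum_v1(uint32_t v)
{
    uint32_t eax = v & 0x0F0F0F0Fu;          /* low nibble of each byte      */
    uint32_t edx = (v >> 4) & 0x0F0F0F0Fu;   /* high nibble of each byte     */
    edx |= 0x10101010u;                      /* bias each byte by 16         */
    edx -= eax;                              /* each byte = 16 + high - low  */
    eax = edx + (edx >> 16);                 /* fold upper word onto lower   */
    uint32_t al = (eax + (eax >> 8)) & 0xFFu;/* add al,ah / and eax,0FFh     */
    return (int)al - 64;                     /* remove the 4 * 16 bias       */
}
```

Each step uses the result of the one before it, which is the dependency chain the timing comments show.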

Here is one possible solution (I have 5 different versions)

that removes the dependencies (7 clocks):

mov ebx,eax

shr eax,4

and ebx,0f0f0f0fh

and eax,0f0f0f0fh

mov edx,ebx

mov ecx,eax

rol edx,16

rol ecx,16

add ebx,edx

add eax,ecx

add bl,bh

add al,ah

sub al,bl
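The same kind of C model for this version, again just a sketch with hypothetical names. The two nibble sums are built in separate registers, so the two chains do not wait on each other. Note that the asm leaves the answer in AL only, as a signed byte; the upper bytes of EAX still hold leftovers from the adds, so it would need sign-extending if a full dword is wanted.

```c
#include <stdint.h>

/* rol reg,16 swaps the two 16-bit halves of a dword. */
static uint32_t rol16(uint32_t x) { return (x << 16) | (x >> 16); }

/* C model of the 7-clock sequence (function name is hypothetical). */
int alt_hex_sum_v2(uint32_t v)
{
    uint32_t ebx = v & 0x0F0F0F0Fu;          /* low nibbles  (subtracted)    */
    uint32_t eax = (v >> 4) & 0x0F0F0F0Fu;   /* high nibbles (added)         */
    ebx += rol16(ebx);                       /* each byte now sums 2 nibbles */
    eax += rol16(eax);
    uint8_t bl = (uint8_t)(ebx + (ebx >> 8));/* add bl,bh: sum of 4 nibbles  */
    uint8_t al = (uint8_t)(eax + (eax >> 8));/* add al,ah                    */
    return (int)al - (int)bl;                /* value of AL as a signed byte */
}
```

Both sums are at most 60, so nothing overflows a byte before the final subtraction.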

