This was suggested by eko as an addition to a previous question. If he doesn't mind, I'm going to reword it here as a full question on its own, as I think it works better that way.

You have two balls of mass m1 & m2, with velocities u1 & u2. The velocities will be of different sign. Allowance should also be made for the case where one ball isn't moving initially. The balls are placed so that they will collide, so you don't need to worry about their initial positions in your solution. Additionally, all movement takes place purely along a horizontal line, so this is all 1-dimensional.

Here's some terrible ASCII art to show what I mean.
u1 = 10         u2 = -3
O --->          <- o
m1 = 20         m2 = 10

After the collision each ball is probably going to fly off in a different direction; you need to find those velocities. There is also an additional value e, called the coefficient of restitution, which controls how much energy is lost in the collision.

So to formalise the question.
u1 Initial velocity of ball1
m1 Mass of ball1
u2 Initial velocity of ball2
m2 Mass of ball2
e Coefficient of restitution

v1 Final velocity of ball1
v2 Final velocity of ball2

All values should be real4, don't worry about acceleration, friction, etc.

Finally, seeing as some of us will already know the equations involved, it's only fair to repeat them here so that everyone has a fair chance.

So first we have eko's one, the conservation of momentum equation.
m1u1 + m2u2 = m1v1 + m2v2

Secondly we have the energy transfer equation.
v2 - v1
------- = e
u1 - u2
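
For anyone who wants to sanity-check a candidate answer before coding it for the FPU, here is a minimal C sketch that tests a pair (v1, v2) against the two equations above. The function name check_collision and the tolerance parameter tol are my own illustrative choices, and float stands in for real4.

```c
/* Returns 1 when (v1, v2) satisfies both equations to within tol, else 0.
   check_collision and tol are illustrative names, not part of the question. */
int check_collision(float m1, float u1, float m2, float u2, float e,
                    float v1, float v2, float tol)
{
    /* Conservation of momentum: m1u1 + m2u2 = m1v1 + m2v2 */
    float momentum_error = (m1 * u1 + m2 * u2) - (m1 * v1 + m2 * v2);

    /* Restitution, cross-multiplied so we never divide by (u1 - u2):
       v2 - v1 = e(u1 - u2) */
    float restitution_error = (v2 - v1) - e * (u1 - u2);

    /* Absolute values without pulling in the maths library */
    if (momentum_error < 0.0f) momentum_error = -momentum_error;
    if (restitution_error < 0.0f) restitution_error = -restitution_error;

    return momentum_error <= tol && restitution_error <= tol;
}
```

Note that an "answer" which simply leaves both balls at their initial velocities passes the momentum test but fails the restitution test, so both checks are needed.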
Posted on 2002-02-12 05:39:27 by Eóin
No one yet? :(

Come on, give it a try. :alright:
Posted on 2002-02-14 02:49:19 by Eóin
Sorry Eóin,

My maths is so lousy that if I can't count it on my fingers, it will not happen. :tongue:

Posted on 2002-02-14 04:13:16 by hutch--
LOL, not to worry Hutch. At least you replied :grin:

Can I ask seriously, though, as to why the lack of interest? The Svin himself said, "I'm surprised that even given task weren't solve by many programmers." I have to say I'm inclined to agree with him.

Is it maths ability or assembly ability that keeps most of our members here from suggesting answers? I mean, this question is difficult enough, but for most people where does the difficulty lie? Is it in deriving the formula needed to express the answer in terms of the given values? Or is it in coding that formula in assembly?

What I'd like to see and I'm sure others would agree is many members here contributing solutions. It doesn't matter if your solution isn't the fastest nor the simplest. By trying it yourself first you'll better understand the various optimisations our more experienced programmers have to demonstrate. Anything you don't understand, just ask. :)

Also, even if you feel you couldn't code an equation in the FPU, why not post your equations. For example, the above two equations have to be combined to create equations for v1 & v2.

I'll leave it open till tomorrow, then if no one else has made an attempt I'll post the fully derived equations and hopefully someone will try and code them.

I just hope people aren't holding back because they're worried about embarrassing themselves. Maths is hard enough without the added difficulty of FPU assembly code; everyone struggles at it. :alright:
Posted on 2002-02-14 07:48:08 by Eóin
It is a matter of time for me. :(
Posted on 2002-02-14 10:23:43 by bitRAKE
I realise that, bitRAKE; with university, Thomas is probably in the same position. Indeed I too am fairly restricted Monday to Thursday, as I'm away at college and can only go online on the college PCs, which don't have Masm.

But I was curious mainly as to why the three of us are the only real contributors. Not the only ones at all, but definitely the main ones.

I'd like to see many people giving them a try. Worst case scenario you'll learn something new.
Posted on 2002-02-14 11:04:22 by Eóin
My simplified formula for calculating it:
v1=( u2m2(1+e)+u1(m1-em2) )/(m1+m2)
I don't know how to use the FPU :( but here is my try


imul edx,ecx ;u2*m2
inc edi ;e+1
imul edx,edi ;u2*m2(e+1)
dec edi ;e
imul edi,ecx ;e*m2
push eax ;store m1
sub eax,edi ;m1-e*m2
imul ebx,eax ;ebx(m1-e*m2)
add ebx,edx ;u2m2(1+e)+u1(m1-em2)
pop eax ;m1
add eax,ecx ;m1+m2
idiv ebx,eax ;v1

I haven't tested it, hope it works :)
Posted on 2002-02-14 19:15:04 by LaptoniC
Congratulations LaptoniC, those equations generate the correct answers.

Your code unfortunately doesn't work quite right at the end with the idiv instruction. :(

Don't forget that idiv only takes one operand, and it divides EDX:EAX by that operand. Your code tries to divide EBX by EAX. The simple solution is to swap ebx and eax at the end. You also need to sign-extend eax into edx (cdq does this).

With a bit of restructuring of your code you can get all this to work nicely.

But congratulations again. Now does anyone else wish to convert it to FPU code? :)
Posted on 2002-02-15 11:19:07 by Eóin
Right, for all those interested, I'm going to do a detailed run down of the solution.

First we need the equations; LaptoniC provided us with those:
v1=(u2m2(1+e)+u1(m1-em2) )/(m1+m2)

These are correct, but I found a slightly different form of the equations was more efficient:
v1 = [m1u1 + m2(u2 - e(u1 - u2))]/(m1 + m2)
v2 = v1 + e(u1 - u2)

My first equation and LaptoniC's are the same, just reordered. But note the key bit, e(u1 - u2), which appears in both v1 & v2; this allows us to reuse it and save on some calculation.
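
As a reference to compare FPU results against, here is a minimal C sketch of these two equations. The function name collide and the local name shared are just illustrative choices of mine, and float is used to match real4.

```c
#include <assert.h>

/* Final velocities after a 1-dimensional collision.
   collide and shared are illustrative names, not from the thread. */
void collide(float m1, float u1, float m2, float u2, float e,
             float *v1, float *v2)
{
    /* The formula divides by m1 + m2, so guard against a zero total mass */
    assert(m1 + m2 != 0.0f);

    /* e(u1 - u2) appears in both equations, so compute it once */
    float shared = e * (u1 - u2);

    *v1 = (m1 * u1 + m2 * (u2 - shared)) / (m1 + m2);
    *v2 = *v1 + shared;
}
```

For the example in the opening post (m1 = 20, u1 = 10, m2 = 10, u2 = -3) with e = 0.5, this gives v1 = 3.5 and v2 = 10. Momentum is 170 both before and after.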

Right, now we want to code this. Let's do it very simply: first we calculate v1, so let's look at the equation. The overall form is a/b. b is simply m1 + m2. We'll deal with the top bit, a, first.

Basically you want to break the equation into separate bits. For example the start of it, m1u1, is separate, so let's calculate that first

fld m1
fmul u1

Now you leave that on the stack and go to the next bit, m2(u2 - e(u1 - u2)). The first thing we see is multiplication by m2; however, that's on its own, so we should calculate inside the brackets first and simply use a final fmul m2 to multiply by m2.

Inside the bracket again the first thing we see is a variable, u2, on its own so we move on. Next we see e, again on its own so move on again. Ahh, now we have something to calculate, u1 - u2, here we go

fld u1
fsub u2

Right, now that we have u1 - u2 we can multiply by e to get e(u1 - u2).

fmul e

Next we can work out the subtraction u2 - e(u1 - u2). Be careful here, you can't use fsub u2 as that would give you e(u1 - u2) - u2. Instead we must use reverse subtraction

fsubr u2

And now multiply by m2 to get m2(u2 - e(u1 - u2))

fmul m2

At this stage you may remember that we left m1u1 on the stack when we calculated it. Since we haven't messed with the stack it should now be at st(1), so to get m1u1 + m2(u2 - e(u1 - u2)) we simply add st & st(1), which is written as faddp st(1),st or the much simpler short form

fadd

Right, that's the top half. Leave that on the stack and move on to the bottom part, m1 + m2;

fld m1
fadd m2

We need to divide these now, so fdivp st(1),st would do it, but I'm using the short form again;

fdiv

And hey presto we have calculated v1. You store that in a variable, however we're not going to pop it off the stack as we'll use it to calculate v2.

fst v1

To get v2 we add e(u1 - u2) to v1 which is on the stack.

fld u1
fsub u2
fmul e
fadd

And store the result

fstp v2

Now you're asking, what about optimisations, what happened to that great plan to reuse the e(u1 - u2) bit? Well, what I wanted to describe to you there is the basics of coding an equation. Optimisations are harder to follow, but for the sake of it I'll go through the e(u1 - u2) bit.

Just after we have calculated e(u1 - u2) we will store it with fst.

fmul e
fst st(?)
fsubr u2

But where are we going to store it? You could plonk it into a memory variable, but let's be more efficient and put it in one of the free spots on the fpu stack. We know st(0) holds e(u1 - u2), and remember st(1) is still holding m1u1 at this stage, so put it in st(2).

What this means is that when it comes to calculating v2, we have v1 in st(0) and e(u1 - u2) in st(1). But v2 is the sum of those two values, so fadd and we're home free, no recalculation of e(u1 - u2).

Another good fpu optimisation is to try and never load a value more than once. The following shows the most optimised version I could write. Hopefully anyone who wishes to will be able to follow this and learn.

               ; | st0                            | st1             | st2      | st3      | st4

fld m2         ; | m2                             |
fld u1         ; | u1                             | m2              |
fld u2         ; | u2                             | u1              | m2       |
fld st(1)      ; | u1                             | u2              | u1       | m2       |
fsub st,st(1)  ; | u1-u2                          | u2              | u1       | m2       |
fmul e         ; | e(u1-u2)                       | u2              | u1       | m2       |
fst st(4)      ; | e(u1-u2)                       | u2              | u1       | m2       | e(u1-u2)

fsub           ; | u2-e(u1-u2)                    | u1              | m2       | e(u1-u2) |
fmul st,st(2)  ; | m2(u2-e(u1-u2))                | u1              | m2       | e(u1-u2) |

fld m1         ; | m1                             | m2(u2-e(u1-u2)) | u1       | m2       | e(u1-u2)
fadd st(3),st  ; | m1                             | m2(u2-e(u1-u2)) | u1       | m2+m1    | e(u1-u2)
fmulp st(2),st ; | m2(u2-e(u1-u2))                | m1u1            | m2+m1    | e(u1-u2) |

fadd           ; | m1u1+m2(u2-e(u1-u2))           | m2+m1           | e(u1-u2) |
fdivr          ; | (m1u1+m2(u2-e(u1-u2)))/(m2+m1) | e(u1-u2)        |
fst v1         ; | (m1u1+m2(u2-e(u1-u2)))/(m2+m1) | e(u1-u2)        |
fadd           ; | v1+e(u1-u2)                    |
fstp v2        ; |                                |
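
If you want to check the stack comments without an assembler to hand, the listing can be mimicked in C with a toy model of the register stack. This is only a sketch: the st array, push, pop and collide_stack are names I've made up for illustration, and the model ignores the FPU's tag bits and extended precision. It assumes the no-operand forms behave as MASM assembles them (fsub as fsubp st(1),st, fadd as faddp st(1),st, fdivr as fdivrp st(1),st).

```c
/* Toy model of the x87 register stack: st[0] plays the role of st(0).
   All names here are illustrative, not from the thread. */
static float st[8];

static void push(float x)              /* like fld: shift down, load on top */
{
    for (int i = 7; i > 0; --i) st[i] = st[i - 1];
    st[0] = x;
}

static void pop(void)                  /* drop st(0), everything moves up */
{
    for (int i = 0; i < 7; ++i) st[i] = st[i + 1];
    st[7] = 0.0f;
}

void collide_stack(float m1, float u1, float m2, float u2, float e,
                   float *v1, float *v2)
{
    push(m2);                          /* fld m2 */
    push(u1);                          /* fld u1 */
    push(u2);                          /* fld u2 */
    push(st[1]);                       /* fld st(1) */
    st[0] -= st[1];                    /* fsub st,st(1)  -> u1-u2 */
    st[0] *= e;                        /* fmul e         -> e(u1-u2) */
    st[4] = st[0];                     /* fst st(4)      keep a copy */

    st[1] = st[1] - st[0]; pop();      /* fsub           -> u2-e(u1-u2) */
    st[0] *= st[2];                    /* fmul st,st(2)  -> m2(u2-e(u1-u2)) */

    push(m1);                          /* fld m1 */
    st[3] += st[0];                    /* fadd st(3),st  -> m2+m1 */
    st[2] *= st[0]; pop();             /* fmulp st(2),st -> m1u1 */

    st[1] += st[0]; pop();             /* fadd           -> numerator */
    st[1] = st[0] / st[1]; pop();      /* fdivr          -> v1 */
    *v1 = st[0];                       /* fst v1 */
    st[1] += st[0]; pop();             /* fadd           -> v1+e(u1-u2) */
    *v2 = st[0]; pop();                /* fstp v2 */
}
```

With m1 = 20, u1 = 10, m2 = 10, u2 = -3 and e = 0.5 this ends with v1 = 3.5 and v2 = 10, the same as evaluating the two equations directly.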

And to end on an interesting note. When I looked at the VC++ output for the following,
v1 = (m1*u1 + m2*(u2-e*(u1-u2)))/(m2+m1);
v2 = v1 + e*(u1-u2);

It had generated the following code
Posted on 2002-02-15 18:21:32 by Eóin
"But I was curious mainly as to why the three of us are the only real contributors."

Because it's sooooo HEAVY....Most of us only got a D- in math...

I'm sure we all are enjoying stuff like this, we just don't know what to say or do just yet...It's nice to see the HEAVY'S having fun and optimizing wars....

Please Don't Stop

Posted on 2002-02-16 06:25:50 by cmax
No other questions? btw. I found this thread while looking through all the threads containing FPU. I'm too late indeed. :(
Posted on 2003-08-28 12:11:20 by inFinie
It's no harm to revive an old thread. Besides, I need to correct a mistake I made. The VC optimizing compiler can actually generate quite efficient code given the two equations;

v1 = (m1*u1 + m2*(u2 - e*(u1 - u2)))/(m1 + m2)
v2 = v1 + e*(u1 - u2)

It produces
fld u1
fmul e
fld u2
fsub st,st(1)
fmul m2
fld m1
fmul u1
faddp st(1),st
fld m1
fadd m2
fdivrp st(1),st
fld st
fstp v1
fxch st(1)
fadd st,st(1)
fstp v2
fstp st

This is actually the same length (no. of instructions) as the solution I suggested. Note though that the compiler never tends to access registers from st(2) or higher. This means it doesn't end up storing any of the values on the stack and instead has to reload them when needed.

In the end it tends to run about 10% slower than my version. It's not much, but at least there's still some hope for us.
Posted on 2003-08-28 18:24:14 by Eóin
Normalization of the equation for v1 may afford some simplification.
Note that v1 = (m3*u1 + (u2 - e*(u1 - u2)))/(m3 + 1), where m3 = m1/m2.
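
A minimal C sketch of the normalised form, assuming it comes from dividing the top and bottom of v1 = (m1*u1 + m2*(u2 - e*(u1 - u2)))/(m1 + m2) by m2. The function name collide_v1_normalised is only illustrative.

```c
#include <assert.h>

/* v1 from the mass ratio m3 = m1/m2 instead of the two masses.
   collide_v1_normalised is an illustrative name, not from the thread. */
float collide_v1_normalised(float m3, float u1, float u2, float e)
{
    /* The formula divides by m3 + 1, so guard against m3 = -1 */
    assert(m3 + 1.0f != 0.0f);

    return (m3 * u1 + (u2 - e * (u1 - u2))) / (m3 + 1.0f);
}
```

With m3 = 20/10 = 2, u1 = 10, u2 = -3 and e = 0.5 this gives 3.5, matching the unnormalised equation. The saving is one load and one multiply per collision once m3 is known, so it pays off when the same pair of masses collides many times.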
Posted on 2003-08-28 18:47:49 by Poimander
Hope you discuss geometry too, like Pythagoras, ArcCos and so on. I need it. :alright:
Posted on 2003-08-29 11:50:07 by realvampire
Poimander, that's a good suggestion. In a once-off calculation the act of normalising the values would probably offset any advantages, but there are plenty of situations where it would improve things.

It just goes to show the importance of algebra as much as coding.

realvampire, feel free to suggest math questions/problems; if I don't answer it'll probably only be because someone beat me to it.
Posted on 2003-08-29 12:13:21 by Eóin
I have an Angle function from the GLib. There are three parameters to obtain the angle: Ax=DeltaY, DX=DeltaY, and BX = address of the normalisation factor. The question is, what is the normalisation factor?

Thank you
Posted on 2003-09-04 21:33:17 by realvampire