Hi all, I'm having trouble figuring out the div instruction.
What I thought it did was divide eax by the given source and return the quotient in eax and the remainder in edx...? But according to what my code tells me, it does not.
Here are the relevant parts of it:


...

CPUID
RDTSC
push eax
invoke Sleep,1000
CPUID
RDTSC
pop ebx
sub eax,ebx
mov freq,eax

...
start_timer:
CPUID
RDTSC
mov tid,eax
jmp rerun

;rerun:
;invoke DefWindowProc,hWnd,uMsg,wParam,lParam

stopp_timer:
CPUID
RDTSC
sub eax,tid
xor edx,edx
div freq
...

After execution, eax and edx seem to contain random numbers.
The labels are jumped to when buttons are clicked, and I know that this will make a difference, but not in the size of these figures. I'll attach the full source code as well (and an assembled version of it).
Posted on 2004-01-25 19:45:56 by sluggo
div divides edx:eax by the operand, and stores the quotient in eax and the remainder in edx, yes...
I think you may have a problem with your use of rdtsc... It returns a 64-bit number, not a 32-bit one. So the low 32 bits might wrap (that only takes a few seconds on modern machines!) and appear random?
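
To make the rule concrete, here is a rough C model of the unsigned 32-bit div. It is only a sketch: the name model_div32 and its return convention are made up for illustration, and where this helper returns 0 the real instruction raises a divide-error exception instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper modelling the unsigned 32-bit `div src`.
   The dividend is the 64-bit value edx:eax. Returns 1 on success,
   0 where the CPU would fault (src == 0, or quotient too big for eax). */
int model_div32(uint32_t edx, uint32_t eax, uint32_t src,
                uint32_t *quot, uint32_t *rem)
{
    assert(quot != 0 && rem != 0);
    if (src == 0 || edx >= src)      /* edx >= src means quotient >= 2^32 */
        return 0;
    uint64_t dividend = ((uint64_t)edx << 32) | eax;
    *quot = (uint32_t)(dividend / src);   /* ends up in eax */
    *rem  = (uint32_t)(dividend % src);   /* ends up in edx */
    return 1;
}
```

The point of the model is that whatever happens to be in edx is part of the dividend, so a stale value there changes the quotient completely.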
Posted on 2004-01-25 20:35:22 by Henk-Jan
As Henk-Jan mentioned, RDTSC returns a 64-bit count of clock cycles in EDX:EAX, with the high-order portion in EDX and the low-order portion in EAX.

Therefore, keeping only the low-order portion of the return value when you established the frequency with the Sleep,1000 could potentially give erroneous results on modern CPUs operating above the GHz range.

The CPUID instruction also needs a value in EAX to define which information about the CPU is required.
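
As a quick sanity check on how fast the low dword wraps, here is a small C sketch. The function names are hypothetical and the clock rates used below are only sample values.

```c
#include <assert.h>
#include <stdint.h>

/* Join the two halves RDTSC leaves in EDX (high) and EAX (low). */
uint64_t tsc_from_halves(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}

/* Whole seconds before the low dword alone wraps at a given clock rate. */
uint32_t seconds_until_low_wrap(uint32_t hz)
{
    assert(hz != 0);
    return (uint32_t)((UINT64_C(1) << 32) / hz);
}
```

At 600000000 Hz the low dword wraps after about 7 seconds, and at 3000000000 Hz after about 1 second, so any interval longer than that needs the high dword as well.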

Raymond
Posted on 2004-01-25 23:21:32 by Raymond
But I still don't get it :confused:...
Wouldn't that mean that this code should always produce a 1 in eax (yes, I'm on a monster 600 MHz =):


CPUID
RDTSC
push eax
push edx
invoke Sleep,1000
CPUID
RDTSC
pop ecx
pop ebx
sub eax,ebx
sub edx,ecx
xor ecx,ecx
mov ecx,600000000
div ecx

Most of the time it does, but every now and then I get an 8 instead.
Thank you for your help :alright:!
Posted on 2004-01-26 05:13:52 by sluggo
Are you sure you don't want to clear edx before the div?
Posted on 2004-01-26 05:35:03 by AceEmbler
Perhaps use sbb instead of sub when subtracting the high-order dword?
Posted on 2004-01-26 07:07:37 by _js_
He doesn't want to clear edx when he needs a 64-bit division.
He should use sbb, yes, and he might also want to do a long division, because in some cases you might get an overflow this way.
You don't need the xor ecx, ecx before the mov ecx, of course.

Oh, and you don't really need to set eax before using cpuid; it is only used here to flush the pipeline.
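
The occasional 8 is consistent with a lost borrow: whenever the low dword wraps during the measured second, two plain subs leave 1 in edx, and (2^32 + 600000000) / 600000000 is 8. A small C model of both variants, as a sketch only (the function names are hypothetical and the counter values in the note below are made up):

```c
#include <assert.h>
#include <stdint.h>

/* `sub` on the low dwords, then `sbb` on the high dwords. */
uint64_t diff_sub_sbb(uint32_t end_hi, uint32_t end_lo,
                      uint32_t start_hi, uint32_t start_lo)
{
    uint32_t lo = end_lo - start_lo;                 /* sub: wraps modulo 2^32 */
    uint32_t borrow = (end_lo < start_lo) ? 1u : 0u; /* carry flag after the sub */
    uint32_t hi = end_hi - start_hi - borrow;        /* sbb: also subtracts the borrow */
    uint64_t result = ((uint64_t)hi << 32) | lo;

    /* Cross-check against a native 64-bit subtraction. */
    uint64_t end = ((uint64_t)end_hi << 32) | end_lo;
    uint64_t start = ((uint64_t)start_hi << 32) | start_lo;
    assert(result == end - start);
    return result;
}

/* Two plain `sub`s: the borrow out of the low dword is lost. */
uint64_t diff_sub_sub(uint32_t end_hi, uint32_t end_lo,
                      uint32_t start_hi, uint32_t start_lo)
{
    uint32_t lo = end_lo - start_lo;
    uint32_t hi = end_hi - start_hi;
    return ((uint64_t)hi << 32) | lo;
}
```

With a start of 5:F0000000h and an end 600000000 ticks later (6:13C34600h), diff_sub_sbb returns 600000000, while diff_sub_sub returns 2^32 + 600000000, which divided by 600000000 gives 8.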
Posted on 2004-01-26 07:14:49 by Henk-Jan


sluggo,
BTW, I think the translation of your name to Sanskrit is wrong ;)

Posted on 2004-01-26 13:35:27 by MazeGen

It is?!:eek:
Dunno Sanskrit (I just copied the letters off some chart I found through Google); can you translate it correctly for me?


Thank you for your help, using sbb instead of sub did the trick for full seconds. But since only the remainder is left in edx, how can I get the decimal answer of the division? Say I want to divide 7 (or, as a 32-bit value, 00000007) by 2. That will leave me with 3 in eax and 1 in edx, right?
The real problem is that I'm writing a timer, and basically the division I want to do is (rdtsc - rdtsc2) / hz = sec (to get the frequency of the processor:


rdtsc
mov tscLDW,eax
mov tscHDW,edx
invoke Sleep,1000
rdtsc
sbb eax,tscLDW
sbb edx,tscHDW
mov freq,eax
) I can't seem to figure out how to use the div instruction with the 64-bit values returned by rdtsc...? Sorry for being a slowmo :( !
Posted on 2004-01-27 09:06:54 by sluggo
Use long-division as you were taught in school :)

You divide the high dword and the low dword separately... Let's say you want to do this:

c = a/b;

c is 64 bit, a is 64 bit, and b is 32 bit.

We can write a and c as 2 dwords:

a.hi*(2^32) + a.lo

Now, we fill that in:

c = (a.hi*(2^32) + a.lo) / b
c = (a.hi*(2^32)) / b + a.lo / b

a.hi*(2^32) / b == (a.hi/b) * (2^32)

So c.hi == (a.hi/b)

This is only true if you assume fractions however. Since we have integers, we also have a remainder. This remainder should still be processed with a.lo.
Note that the remainder also has a 2^32 factor in it, and a.lo is < 2^32 by definition. So we can simply use the remainder as the high dword now. The formed number cannot give a division overflow by definition, because the remainder must be smaller than b.

So to calc c.lo, we get (rem*(2^32) + a.lo)/b

In code, something like this:



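mov eax, [a.hi]
mov edx, 0 ; high dword of the first dividend is 0
div [b] ; eax = a.hi / b, edx = remainder (always < b)
mov [c.hi], eax
mov eax, [a.lo] ; note that the remainder is already in edx!!
div [b] ; divides rem:a.lo by b; cannot overflow because rem < b
mov [c.lo], eax ; actual remainder of complete division is now in edx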


You can expand this routine to work on numbers of any size, divided by a 32 bit number.

PS: you should only use sbb for the second subtraction. For the first one you do not want to borrow, and the carry flag may hold anything left over from earlier instructions, so sbb could give the wrong result; it will only give the correct result if carry happens to be clear.
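
The two-step long division above can be checked with a small C model. This is a sketch under assumptions: the function name and the rem_out parameter are made up for illustration, and each step stands in for one div instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Divide the 64-bit value a_hi:a_lo by the 32-bit value b using two
   steps that each fit a 32-bit `div`. Returns the 64-bit quotient and
   optionally stores the final remainder. */
uint64_t long_div_64_by_32(uint32_t a_hi, uint32_t a_lo, uint32_t b,
                           uint32_t *rem_out)
{
    assert(b != 0);

    /* Step 1: edx = 0, eax = a.hi, div b */
    uint32_t c_hi = a_hi / b;
    uint32_t rem  = a_hi % b;             /* stays in edx, always < b */

    /* Step 2: edx = rem, eax = a.lo, div b (no overflow since rem < b) */
    uint64_t partial = ((uint64_t)rem << 32) | a_lo;
    uint32_t c_lo = (uint32_t)(partial / b);

    if (rem_out != 0)
        *rem_out = (uint32_t)(partial % b);
    return ((uint64_t)c_hi << 32) | c_lo;
}
```

For any inputs with b nonzero, the result should match a native 64-bit division of the same value by b, which makes the routine easy to test.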
Posted on 2004-01-27 09:57:41 by Henk-Jan
"how can i reach the decimal answer of the division"
Is this only for display purposes, or for further processing, or both? What accuracy do you need in either case?

The FPU can handle QWORD integers in memory. That would be a lot easier and faster than doing long divisions "by hand". Let me know if you need more help for that route.

Raymond
Posted on 2004-01-27 13:00:30 by Raymond
It's for both. For display I need an accuracy of six decimals (it doesn't need to be rounded off, just truncated), and in further processing as accurate as possible =)! All help much appreciated!
Posted on 2004-01-27 15:38:59 by sluggo
I guess double precision floats (doubles) are most accurate here, because the divisor is relatively large compared to the dividend, meaning you get a rather small quotient, and possibly a large fraction...
Then again, even with integer, you should have plenty of precision for most stuff, it depends on what you want to do.
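
For the display side, six truncated decimals can be had with integers alone by scaling the remainder. A minimal C sketch, assuming the frequency fits in 32 bits (the function name and buffer handling are made up for illustration):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Writes ticks / freq into buf as "S.FFFFFF", truncated (not rounded)
   to six decimals. Returns the value returned by snprintf. */
int format_seconds_6(char *buf, size_t size, uint64_t ticks, uint32_t freq)
{
    assert(freq != 0);
    uint64_t whole = ticks / freq;
    uint64_t rem   = ticks % freq;            /* < freq, so below 2^32 */
    uint64_t frac  = rem * 1000000u / freq;   /* product stays below 2^52 */
    return snprintf(buf, size, "%" PRIu64 ".%06" PRIu64, whole, frac);
}
```

Dividing 7 by 2 this way yields "3.500000", and 2 by 3 yields "0.666666" rather than a rounded "0.666667".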

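A floating-point division would go like this, by the way (assuming the same 64-bit a and c and 32-bit b as in the previous example):


fild qword ptr [a] ; load the 64-bit integer a onto the FPU stack
fidiv dword ptr [b] ; st(0) = a / b, with b read as a signed 32-bit integer
fistp qword ptr [c] ; round to an integer (per the rounding mode), store in c, pop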


So quite simple really, load one operand on FPU stack, divide by memory operand directly, store and pop result to destination (for floating point numbers, remove the 'i' from the mnemonic... It stands for 'integer' (fld, fdiv, fstp)).

Printing a decimal number is done by dividing out digit for digit... You can get the lowest digit by getting the remainder of a division by 10 (for decimal). Convert that number to ASCII (add '0', or 30h to it), store it, and move to the next one (divide the number by 10 to remove the lowest digit, then again take the remainder), until you have 0 left.
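
The digit-by-digit conversion can be sketched in C as follows. The function name and the fixed buffer sizes are assumptions for illustration; a do-while loop is used so that a value of 0 still produces one digit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes value in decimal into buf, which must hold at least 11 bytes
   (10 digits plus the terminator). Returns the number of digits. */
size_t u32_to_decimal(uint32_t value, char *buf)
{
    assert(buf != NULL);
    char tmp[10];
    size_t n = 0;

    do {
        tmp[n++] = (char)('0' + value % 10);  /* remainder is the lowest digit */
        value /= 10;                          /* drop that digit */
    } while (value != 0);

    /* Digits came out lowest first, so copy them back in reverse. */
    for (size_t i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```

Since the digits are produced from the right, they are either reversed afterwards as here or written from the end of the buffer backwards.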
Posted on 2004-01-27 15:56:26 by Henk-Jan
How about the following:
.data


freq label qword
freqL dd ?
freqH dd ?

result dt ? ;extended double-precision

buffer db 32 dup(?) ;for string output

.code

CPUID
RDTSC
push edx
push eax
invoke Sleep,1000
CPUID
RDTSC
pop ecx
sub eax,ecx
pop ecx
sbb edx,ecx
mov freqL,eax
mov freqH,edx

...
start_timer:
CPUID
RDTSC
push edx
push eax
jmp rerun

;rerun:
;invoke DefWindowProc,hWnd,uMsg,wParam,lParam

stopp_timer:
CPUID
RDTSC
pop ecx
sub eax,ecx
pop ecx
sbb edx,ecx
push edx ;store on stack, H.O. 1st
push eax ; L.O. 2nd
finit
fild qword ptr[esp] ;load qword from stack
fild freq ;loads the frequency (qword)
fdiv
add esp,8 ;restore stack
invoke FpuFLtoA,0,6,ADDR buffer,SRC1_FPU or SRC2_DIMM
fstp result
...
The result stored in extended double-precision (REAL10) format will be the most precise available. The result converted to ASCII will have 6 decimal places. The FpuFLtoA function is part of the FPULIB available with the MASM32 package; that library of FPU functions is also available in some other thread of this forum if you don't have it.
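
For comparison, the same elapsed / frequency division can be modelled in C. This is only a sketch with a hypothetical function name. Note that long double matches the 80-bit REAL10 format with some x86 compilers (GCC, for example) but is only 64 bits wide with others, so the precision is an assumption about the toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed time in seconds as a floating-point value:
   both operands are converted first, like the two fild loads,
   and then divided, like the fdiv. */
long double elapsed_seconds(uint64_t ticks, uint64_t freq)
{
    assert(freq != 0);
    return (long double)ticks / (long double)freq;
}
```

For instance, 900000000 ticks at a measured frequency of 600000000 gives exactly 1.5.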

Just ask if you need more help.

Raymond
Posted on 2004-01-27 22:46:17 by Raymond




"It is?!:eek:
dunno sanscrit (just copied the letters of some chart i found through google), can you translate it correctly for me?"

The main problem lies in -sl- and -gg-, which are difficult for me to translate.
If you are really interested, ask someone who knows Hindi; they use the same script.
Posted on 2004-01-28 14:40:44 by MazeGen
Thank you Raymond!! Worked like a charm! I've been reading through chapter 14 in AoA to make sure I understand it properly as well :alright:.

OK, I have an uncle who translates Sanskrit scripts. I've been meaning to check it with him since I got the avatar (about two years ago =), but haven't got around to it yet, and right now he's in India!
Posted on 2004-01-28 17:01:21 by sluggo
sluggo

Glad to hear that the suggested code worked to your satisfaction. There's always a risk that such "untested" code snippets may have shortcomings when written out of context.

I've got Randy's AoA in PDF format but could not find any Chap. 14. Which part do you not yet understand: the use of RDTSC or the use of the FPU instructions?

Raymond
Posted on 2004-01-29 10:27:08 by Raymond
here is an online transliterator for most of the indian languages

http://www.aczone.com/itrans/online/

see the .gif attached

i typed in sluggo
it gave me this

;)
Posted on 2004-01-29 11:46:16 by bluffer
Sorry for these off-topics...

Wow bluffer,
it's a nice transliterator, even though it seems a bit buggy ('MA' at the beginning is translated as 'AM', for instance). The transliteration is, AFAIK, probably correct. There is one problem: the "l" in s"l"uggo may in this case be either a vowel or a consonant, and I think an automatic translator can't always choose the best option.
;)
Posted on 2004-01-29 13:55:22 by MazeGen
@Raymond it was the FPU instructions. I haven't worked with them at all really, but I got the hard copy (http://webster.cs.ucr.edu/Page_asm/ArtofAssembly/0_HardCopy.html); the chapters are different in that one. Number 14 is all about floating-point arithmetic!

@bluffer thank you, I've got a new avatar =)!
Posted on 2004-01-29 16:12:47 by sluggo