Hi all, I'm having trouble figuring out the div instruction.

What I thought it did was divide eax by the given source and return the quotient in eax and the remainder in edx...? But according to what my code tells me, it does not.

Here are the relevant parts of it:

```
    ...
    CPUID
    RDTSC
    push eax
    invoke Sleep,1000
    CPUID
    RDTSC
    pop ebx
    sub eax,ebx
    mov freq,eax
    ...
start_timer:
    CPUID
    RDTSC
    mov tid,eax
    jmp rerun
;rerun:
;invoke DefWindowProc,hWnd,uMsg,wParam,lParam
stopp_timer:
    CPUID
    RDTSC
    sub eax,tid
    xor edx,edx
    div freq
    ...
```

After execution, eax and edx seem to contain random numbers.

The labels are jumped to by clicking buttons, and I know that this will make a difference, but not in the size of these figures. I'll attach the full source code as well (and an assembled version of it).

Yes, div divides edx:eax by the operand, and stores the quotient in eax and the remainder in edx...
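As a rough illustration of that behaviour, here is a small C model of the 32-bit unsigned div. The function name `div32` and its return convention are made up for this sketch; a real CPU raises a divide error instead of returning a code when the operand is zero or the quotient does not fit in eax.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of 32-bit unsigned DIV.
   Divides edx:eax by operand; on success stores the quotient in *eax and
   the remainder in *edx and returns 0. Returns -1 (registers untouched)
   in the cases where the CPU would raise a divide error. */
int div32(uint32_t *eax, uint32_t *edx, uint32_t operand)
{
    uint64_t dividend;

    assert(eax != NULL && edx != NULL);

    /* Quotient >= 2^32 exactly when the high dword is >= the operand. */
    if (operand == 0 || *edx >= operand)
        return -1;

    dividend = ((uint64_t)*edx << 32) | *eax;
    *eax = (uint32_t)(dividend / operand);   /* quotient  */
    *edx = (uint32_t)(dividend % operand);   /* remainder */
    return 0;
}
```

With edx = 0 and eax = 7, dividing by 2 leaves 3 in eax and 1 in edx. Whatever happens to be in edx counts as the high half of the dividend, which is why it has to be either cleared or deliberately loaded before the div.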

I think you may have a problem with your use of rdtsc... It returns a 64-bit number, not a 32-bit one, so the low 32 bits might wrap (that only takes a few seconds on modern machines!) and appear random?
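To put a number on "a few seconds", here is a minimal C sketch. Both function names are hypothetical, and it assumes the counter ticks at a constant rate of `hz` cycles per second.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until the low 32 bits of the time-stamp counter wrap around,
   assuming a constant tick rate of hz cycles per second. */
double tsc_low_wrap_seconds(double hz)
{
    assert(hz > 0.0);
    return 4294967296.0 / hz;   /* 2^32 cycles */
}

/* Difference of two low-dword readings. Unsigned subtraction works
   modulo 2^32, so this is only the true cycle count if fewer than
   2^32 cycles passed between the two readings. */
uint32_t tsc_low_delta(uint32_t start_lo, uint32_t stop_lo)
{
    return stop_lo - start_lo;
}
```

At 600 MHz the low dword wraps after roughly 7.16 seconds, and at 1 GHz after roughly 4.29 seconds, so any interval longer than that needs the high dword as well.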

As Henk-Jan mentioned, RDTSC returns a 64-bit count of clock cycles in EDX:EAX, with the high-order portion in EDX and the low-order portion in EAX.

Therefore, with modern CPUs running above the GHz range, you can expect potentially erroneous results if you keep only the low-order portion of the return value when establishing the frequency with Sleep,1000.

The CPUID instruction also needs a value in EAX to define which information about the CPU is required.

Raymond

But I still don't get it...

Wouldn't that mean that this code should always produce a 1 in eax (yes, I'm on a monster 600 MHz =):

```
    CPUID
    RDTSC
    push eax
    push edx
    invoke Sleep,1000
    CPUID
    RDTSC
    pop ecx
    pop ebx
    sub eax,ebx
    sub edx,ecx
    xor ecx,ecx
    mov ecx,600000000
    div ecx
```

Most of the time it does, but every now and then I get an 8 instead.

Thank you for your help!

Are you sure you don't want to clear edx before the div?

Perhaps use sbb instead of sub when subtracting the high-order dword?

He doesn't want to clear edx when he needs a 64-bit division.

He should use sbb, yes, and he might also want to do a long division, because in some cases you might get an overflow this way.

You don't need the xor ecx, ecx before the mov ecx, of course.

Oh, and you don't really need to set eax before using cpuid; it is only used here to flush the pipeline.
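The sub/sbb suggestion can be sketched in C by making the borrow explicit. The type `u64_pair` and the function `sub64` are names invented for this example; the internal assert only cross-checks the two-dword result against a native 64-bit subtraction.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two dwords, like edx:eax (hypothetical type). */
typedef struct { uint32_t lo, hi; } u64_pair;

/* Computes stop - start the way a sub/sbb pair does. */
u64_pair sub64(u64_pair stop, u64_pair start)
{
    u64_pair d;
    uint32_t borrow;

    d.lo = stop.lo - start.lo;            /* sub eax, start.lo           */
    borrow = stop.lo < start.lo;          /* carry flag left by that sub */
    d.hi = stop.hi - start.hi - borrow;   /* sbb edx, start.hi           */

    /* Cross-check against a plain 64-bit subtraction. */
    assert((((uint64_t)d.hi << 32) | d.lo) ==
           ((((uint64_t)stop.hi << 32) | stop.lo) -
            (((uint64_t)start.hi << 32) | start.lo)));
    return d;
}
```

A plain sub on the high dword drops the borrow, so the high half comes out one too large whenever the low dword wrapped between the two readings. That may well be the source of the occasional odd result, though the thread does not confirm it.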

sluggo,

BTW, I think the translation of your name to Sanskrit is wrong ;)

It is?!

Dunno Sanskrit (I just copied the letters from some chart I found through Google), can you translate it correctly for me?

Thank you for your help, using sbb instead of sub did the trick for full seconds. But since only the remainder is left in edx, how can I reach the decimal answer of the division? Say I want to divide 7 (or, to get it in 32 bits, 00000007) by 2. That will leave me with 3 in eax and 1 in edx, right?

The real problem is that I'm writing a timer, and basically the division I want to do is (rdtsc - rdtsc2) / Hz = seconds. To get the frequency of the processor I use:

```
    rdtsc
    mov tscLDW,eax
    mov tscHDW,edx
    invoke Sleep,1000
    rdtsc
    sbb eax,tscLDW
    sbb edx,tscHDW
    mov freq,eax
```

I can't seem to figure out how to use the div instruction with the 64-bit values returned by rdtsc...? Sorry for being a slowmo :(

Use long division as you were taught in school :)

You divide the high dword and the low dword separately... Let's say you want to do this:

c = a/b;

c is 64 bit, a is 64 bit, and b is 32 bit.

We can write a and c as 2 dwords:

a.hi*(2^32) + a.lo

Now, we fill that in:

c = (a.hi*(2^32) + a.lo) / b

c = (a.hi*(2^32)) / b + a.lo / b

a.hi*(2^32) / b == (a.hi/b) * (2^32)

So c.hi == (a.hi/b)

This is only true if you assume fractions however. Since we have integers, we also have a remainder. This remainder should still be processed with a.lo.

Note that the remainder also has a 2^32 factor in it, and a.lo is < 2^32 by definition. So we can simply use the remainder as the high dword now. The formed number cannot give a division overflow by definition, because the remainder must be smaller than b.

So to calc c.lo, we get (rem*(2^32) + a.lo)/b

In code, something like this:

```
    mov eax, [a.hi]
    mov edx, 0
    div [b]
    mov [c.hi], eax
    mov eax, [a.lo]    ; note that the remainder is already in edx!!
    div [b]
    mov [c.lo], eax    ; actual remainder of complete division is now in edx
```

You can expand this routine to work on numbers of any size, divided by a 32 bit number.

PS: you should only use sbb for the second subtraction. For the first one you do not want to borrow, and the carry flag may hold anything at that point, so sbb could give the wrong result. It only gives the correct result if carry happens to be clear.
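The long-division routine above can be modelled in C to check the reasoning. This is only a sketch: `div_step` stands in for one div instruction and `div64_by_32` is a hypothetical name for the whole routine.

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit DIV step: divides hi:lo by b, returns the quotient and
   stores the remainder. Requires hi < b so the quotient fits in 32 bits. */
static uint32_t div_step(uint32_t hi, uint32_t lo, uint32_t b, uint32_t *rem)
{
    uint64_t n = ((uint64_t)hi << 32) | lo;

    assert(b != 0 && hi < b);
    *rem = (uint32_t)(n % b);
    return (uint32_t)(n / b);
}

/* 64-bit by 32-bit division built from two DIV steps. */
uint64_t div64_by_32(uint64_t a, uint32_t b, uint32_t *rem)
{
    uint32_t r;
    /* mov edx,0 / mov eax,[a.hi] / div [b] */
    uint32_t c_hi = div_step(0, (uint32_t)(a >> 32), b, &r);
    /* mov eax,[a.lo] / div [b], with the first remainder still in edx */
    uint32_t c_lo = div_step(r, (uint32_t)a, b, &r);

    *rem = r;
    return ((uint64_t)c_hi << 32) | c_lo;
}
```

The `hi < b` precondition holds in both steps: the first step uses 0 as the high dword, and the second uses a remainder that is smaller than b by construction. That is the "cannot give a division overflow" argument from the explanation.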

"how can i reach the decimal answer of the division"

Is this only for display purposes, or for further processing, or both? What accuracy do you need in either case?
The FPU can handle QWORD integers in memory. That would be a lot easier and faster than doing long divisions "by hand". Let me know if you need more help with that route.

Raymond

It's for both. For display I need an accuracy of six decimals (it doesn't need to be rounded, just truncated), and for further processing as accurate as possible =)! All help is much appreciated!

I guess double-precision floats (doubles) are most accurate here, because the divisor is relatively large compared to the dividend, meaning you get a rather small quotient and possibly a large fraction...

Then again, even with integers you should have plenty of precision for most purposes; it depends on what you want to do.
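For the integer route, one possible way to get six truncated decimals is to scale the remainder by 10^6 before dividing it again. The sketch below is in C; `format_seconds` is a hypothetical helper, and it assumes the frequency fits in 32 bits so that the scaled remainder cannot overflow 64 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Formats ticks / freq as seconds with six truncated decimals.
   Returns what snprintf returns (the length of the formatted text). */
int format_seconds(char *buf, size_t size, uint64_t ticks, uint32_t freq)
{
    uint64_t secs, rem, micros;

    assert(buf != NULL && freq != 0);

    secs = ticks / freq;              /* whole seconds                  */
    rem = ticks % freq;               /* leftover cycles, rem < freq    */
    micros = rem * 1000000u / freq;   /* rem * 10^6 < 2^52, no overflow */

    return snprintf(buf, size, "%llu.%06llu",
                    (unsigned long long)secs,
                    (unsigned long long)micros);
}
```

For example, 7 ticks at a frequency of 2 formats as 3.500000. The integer division discards anything past the sixth decimal, which matches the "truncated, not rounded" requirement.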

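A floating-point division would go like this, by the way (assuming the same 64-bit and 32-bit numbers a, b and c as in the previous example):

```
    fild [a]     ; load the integer a onto the FPU stack
    fidiv [b]    ; divide it by the integer b in memory
    fistp [c]    ; store as integer (rounded per the FPU rounding mode) and pop
```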

So quite simple really: load one operand onto the FPU stack, divide by the memory operand directly, then store and pop the result to the destination. (For floating-point operands, remove the 'i' from the mnemonic; it stands for 'integer'. That gives fld, fdiv, fstp.)

Printing a decimal number is done by dividing out digit by digit... You can get the lowest digit by taking the remainder of a division by 10 (for decimal). Convert that digit to ASCII (add '0', or 30h, to it), store it, and move to the next one (divide the number by 10 to remove the lowest digit, then take the remainder again), until you have 0 left.
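The digit-by-digit conversion described above could look like this in C. The function name `u32_to_decimal` is made up for this sketch. Since the digits come out lowest first, they are collected in a scratch array and copied out in reverse.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes value as a NUL-terminated decimal string into buf and
   returns the number of digits written. */
size_t u32_to_decimal(uint32_t value, char *buf, size_t size)
{
    char tmp[10];   /* a 32-bit value has at most 10 decimal digits */
    size_t n = 0, i;

    assert(buf != NULL);

    do {
        tmp[n++] = (char)('0' + value % 10);   /* lowest digit to ASCII */
        value /= 10;                           /* drop that digit       */
    } while (value != 0);

    assert(size > n);   /* room for the digits plus the terminator */

    for (i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];   /* reverse into reading order */
    buf[n] = '\0';
    return n;
}
```

The do/while form makes sure that a value of 0 still produces the single digit '0' instead of an empty string.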

How about the following:

```
.data
freq    label qword
freqL   dd ?
freqH   dd ?
result  dt ?            ;extended double-precision
buffer  db 32 dup(?)    ;for string output

.code
    CPUID
    RDTSC
    push edx
    push eax
    invoke Sleep,1000
    CPUID
    RDTSC
    pop ecx
    sub eax,ecx
    pop ecx
    sbb edx,ecx
    mov freqL,eax
    mov freqH,edx
    ...
start_timer:
    CPUID
    RDTSC
    push edx
    push eax
    jmp rerun
;rerun:
;invoke DefWindowProc,hWnd,uMsg,wParam,lParam
stopp_timer:
    CPUID
    RDTSC
    pop ecx
    sub eax,ecx
    pop ecx
    sbb edx,ecx
    push edx                ;store on stack, H.O. 1st
    push eax                ; L.O. 2nd
    finit
    fild qword ptr[esp]     ;load qword from stack
    fild freq               ;loads the frequency (qword)
    fdiv
    add esp,8               ;restore stack
    invoke FpuFLtoA,0,6,ADDR buffer,SRC1_FPU or SRC2_DIMM
    fstp result
    ...
```

The result stored in extended double-precision (REAL10) format will be the most precise available. The result converted to ASCII will have 6 decimal places. The FpuFLtoA function is part of the FPULIB available with the MASM32 package; that library of FPU functions is also available in another thread of this forum if you don't have it.
Just ask if you need more help.

Raymond

The main problem lies in -sl- and -gg-; it is difficult for me to translate.

If you are really interested, ask someone who knows Hindi; they use the same script.

Thank you Raymond!! Worked like a charm! I've been reading through chapter 14 in AoA to make sure I understand it properly as well.

OK, I have an uncle who translates Sanskrit scripts. I've been meaning to check it with him since I got the avatar (about two years ago =), but I haven't got around to it yet, and right now he's in India!

sluggo

Glad to hear that the suggested code worked to your satisfaction. There's always a risk that such "untested" code snippets may have shortcomings when written out of context.

I've got Randy's AoA in PDF format but could not find any Chap. 14. Which part do you not yet understand: the use of RDTSC or the use of the FPU instructions?

Raymond

Here is an online **transliterator** for most of the Indian languages:

http://www.aczone.com/itrans/online/

See the .gif attached. I typed in sluggo and it gave me this ;)

Sorry for these offtopics...

Wow **bluffer**, it's a nice transliterator, even though it seems a bit buggy ('MA' at the beginning is translated as 'AM', for instance). The transliteration is, AFAIK, probably correct. There is one problem: the "l" in s"l"uggo may in this case be either a vowel or a consonant, and I think an automatic translator can't always choose the best option.

;)

@Raymond it was the FPU instructions. I haven't really worked with the FPU at all, but I got the hard copy (http://webster.cs.ucr.edu/Page_asm/ArtofAssembly/0_HardCopy.html), and the chapters are different in that one. Number 14 is all about floating-point arithmetic!

@bluffer thank you, I've got a new avatar =)!
