I need a function or opcode that does something similar to GetTickCount() or RDTSC, and it has to be fast. According to Agner's tutorial, RDTSC cannot be used in Windows, and I was really frightened when I used the RDTSC opcode and measured several instructions as taking thousands of cycles (it seems RDTSC is skipped in protected mode, and after I modify eax, the stupid results come out):
;..
RDTSC
push eax
fld real4 ptr
fstp real4 ptr
RDTSC
pop ebx
sub eax,ebx
PrintDec eax ; here eax gives some frightening results.
;..
thanks in advance
RDTSC can be used in Windows programs: it returns the number of cycles since the machine started, as a 64-bit value in edx:eax (high dword in edx, low dword in eax).
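Since the counter is 64 bits wide, with the high dword in edx and the low dword in eax, a full-width difference can be taken with sub/sbb. The fragment below is only a sketch in the style of the snippet above: it assumes a 32-bit MASM build with .586 or later so that RDTSC assembles, it assumes esi and edi are free and are not changed by the timed code, and it reuses the PrintDec macro from the first post.

```asm
; Sketch only: esi/edi are assumed free and preserved by the timed code.
    rdtsc                   ; edx:eax = time-stamp counter (edx = high dword)
    mov     esi, eax        ; save low dword of the start value
    mov     edi, edx        ; save high dword of the start value

    ; ... code being timed goes here ...

    rdtsc                   ; read the counter again
    sub     eax, esi        ; low dword:  end - start
    sbb     edx, edi        ; high dword: end - start - borrow
    ; edx:eax now holds the elapsed count as a 64-bit value
    PrintDec eax            ; low dword only
```

For sections that take well under 2^32 cycles the low dword alone gives the same answer, so the push eax / pop ebx version in the first post is not wrong in that respect.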
As ReadIoSys says, it can be used in Windows.
However, it is not virtualised (like eax etc. are), so when a process is switched out, RDTSC keeps on counting. So if there is a task switch inside your timed section, the results are unreliable.
If the code is short, the chance of a task switch (and thus an unreliable value) is much lower. The bigger the code, the greater the chance of a task switch.
Also run the benchmarking code several times; you may be hitting cache issues etc. on the first load.
Mirno
Of course I bear that in mind! But there were several algos which, when I ran them, produced a result of 2000 or near, and I ran test.exe several (30 - 40) times and took the average. The frightening results come when I run an fxxx instruction - maybe the FPU does some initialisation during those 2000 cycles; I will check thoroughly.
Hi Ultrano,
Have you seen this site? Ratch
http://cedar.intel.com/software/idap/media/pdf/rdtscpm1.pdf
Ultrano, I just pick the smallest time of several runs as the exact time. Or I require the minimum value to be returned 10 times to validate that it is the shortest time (sometimes the processor does not return the smallest time again!?).
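The "smallest time of several runs" idea could look roughly like the fragment below. It is a sketch under assumptions, not anyone's actual test harness: RUNS, next_run and not_smaller are made-up names, the build is assumed to be 32-bit MASM with .586 or later, and the timed code is assumed to preserve ebx, esi and edi. The measured value still includes the cost of RDTSC itself and of the mov that follows it.

```asm
; Sketch only: RUNS and the labels are hypothetical names.
RUNS equ 10

    mov     edi, 0FFFFFFFFh ; edi = smallest time seen so far
    mov     ebx, RUNS       ; ebx = runs remaining
next_run:
    rdtsc
    mov     esi, eax        ; esi = low dword of the start value

    ; ... code being timed goes here ...

    rdtsc
    sub     eax, esi        ; eax = elapsed cycles (low dword)
    cmp     eax, edi
    jae     not_smaller     ; unsigned compare: keep the old minimum
    mov     edi, eax        ; new smallest time
not_smaller:
    dec     ebx
    jnz     next_run
    PrintDec edi            ; smallest of the RUNS measurements
```

Runs that were interrupted by a task switch or that hit a cold cache come out larger, so taking the minimum tends to discard them.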