Many will know the Intel document from 1998 on how to use the RDTSC instruction: rdtscpm1.pdf. On p. 5, they provide the following code:



unsigned time, time_low, time_high;
unsigned mhz = 150000000; // 150 MHz processor

__asm rdtsc // Read time stamp into EDX:EAX
__asm mov time_low, eax
__asm mov time_high, edx

Sleep (2000); // Sleep for 2 seconds

__asm rdtsc
__asm sub eax, time_low // Find the difference
__asm sub edx, time_high
__asm div mhz // Unsigned divide EDX:EAX by mhz
__asm mov time, eax

printf("Seconds: %u\n", time);


The line "sub edx, time_high" looks wrong to me. In my view, they would need to use SBB instead of SUB. That is because "sub eax, time_low" may well subtract a bigger number from a smaller one, resulting in a "borrow". If so, then the borrow needs to be taken care of (see the example below). Right? Wrong? "Known bug"?

Regards, Frank


---------------------------------------------------------------
EXAMPLE
Assume that our first measure returns the 64-bit timestamp EDX:EAX = 00000000:FFFFFFFFh, and our second measure returns the 64-bit timestamp EDX:EAX = 00000001:00000001h. Now let's follow Intel's algorithm.

Step 1: Subtracting the low DWORDs (new minus old)
New EAX = 000000001h, old EAX = 0FFFFFFFFh, new EAX - old EAX = 000000002h, "borrow" has taken place
After this step, EDX:EAX = 00000001:00000002h, CF = 1.

Step 2: Subtracting the high DWORDs (new minus old)
New EDX = 000000001h, old EDX = 000000000h, new EDX - old EDX = 000000001h
After this step, EDX:EAX = 00000001:00000002h.

Alternative Step 2: Using SBB instead of SUB
Recall that from Step 1, CF = 1
New EDX = 000000001h, old EDX = 000000000h, new EDX - old EDX - CF = 000000000h
After this step, EDX:EAX = 00000000:00000002h.
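The example above can be replayed in portable C. This is only a sketch: the type `tsc_t` and the helpers `diff_sub_sub`, `diff_sub_sbb` and `to_u64` are names made up for illustration, and the carry flag is emulated with a comparison rather than read from the CPU.

```c
#include <stdint.h>

/* A 64-bit time stamp split into halves, the way RDTSC returns it in EDX:EAX.
   (Hypothetical helper type, not part of the Intel listing.) */
typedef struct {
    uint32_t hi; /* EDX */
    uint32_t lo; /* EAX */
} tsc_t;

/* SUB/SUB as in the listing: the borrow out of the low half is ignored. */
static tsc_t diff_sub_sub(tsc_t newer, tsc_t older)
{
    tsc_t d;
    d.lo = newer.lo - older.lo;   /* sub eax, time_low  (wraps modulo 2^32) */
    d.hi = newer.hi - older.hi;   /* sub edx, time_high (CF is not used)    */
    return d;
}

/* SUB/SBB: the borrow out of the low half is subtracted from the high half. */
static tsc_t diff_sub_sbb(tsc_t newer, tsc_t older)
{
    tsc_t d;
    uint32_t borrow = (newer.lo < older.lo) ? 1u : 0u; /* CF after the low SUB */
    d.lo = newer.lo - older.lo;            /* sub eax, time_low  */
    d.hi = newer.hi - older.hi - borrow;   /* sbb edx, time_high */
    return d;
}

/* Recombine the halves into one 64-bit value for comparison. */
static uint64_t to_u64(tsc_t t)
{
    return ((uint64_t)t.hi << 32) | t.lo;
}
```

With the two time stamps from the example (00000000:FFFFFFFFh and 00000001:00000001h), `diff_sub_sub` yields 00000001:00000002h, while `diff_sub_sbb` yields 00000000:00000002h, which matches a plain 64-bit subtraction.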
Posted on 2003-12-05 12:14:19 by Frank
Yes, you're right, they goofed :)
Second sub should be sbb.

By the way, if you want to use this for a timer, be careful... Some CPUs (especially in laptops, but also the P4 and Athlon 64) can dynamically alter their clock frequency.
So while this is an excellent way to do relative clock cycle counts, it can give wrong results when used as an absolute timer (a common pitfall).
For such a timer, I would recommend using timeGetTime(), the millisecond timer from winmm.dll.
You can control the granularity with timeBeginPeriod()/timeEndPeriod().
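To make the frequency pitfall concrete, here is a small sketch of the arithmetic. It assumes a simplified model in which the time stamp counter advances once per core clock cycle; `cycles_counted` and `cycles_to_seconds` are hypothetical helpers, and the 150 MHz figure is taken from the Intel listing.

```c
#include <stdint.h>

/* Cycles the counter would advance during `seconds` of wall-clock time
   if the CPU actually ran at `actual_hz` (simplified model). */
static uint64_t cycles_counted(double seconds, double actual_hz)
{
    return (uint64_t)(seconds * actual_hz);
}

/* Convert a cycle delta to seconds, assuming a fixed clock rate.
   This is the step that goes wrong when the real rate differs. */
static double cycles_to_seconds(uint64_t cycles, double assumed_hz)
{
    return (double)cycles / assumed_hz;
}
```

In this model, a 2-second sleep on a CPU that really runs at 150 MHz converts back to 2 seconds, but if the CPU throttles to 75 MHz during the sleep, the same conversion reports only 1 second.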
Posted on 2003-12-05 12:28:21 by Bruce-li

> Yes, you're right, they goofed :)
> Second sub should be sbb.

Thought so ... someone should tell them about it.

> By the way, if you want to use this for a timer, be careful... Some CPUs (especially laptops, but also P4 and Athlon64) can dynamically alter their clock frequency.

My (passive) timers rely on the QueryPerformanceCounter function. Would you recommend that function, or is it possibly based on RDTSC anyway?

Regards, Frank
Posted on 2003-12-05 13:04:08 by Frank
QueryPerformanceCounter() is usually a good alternative, yes. It is a chipset feature (so not rdtsc-based) and is not available on every PC: you can assume that pretty much every normal Pentium-class PC or later has it, but on very old PCs or embedded systems, PocketPC and so on, it may be missing.
Note also that some chipsets support this feature but have a buggy implementation. I believe a certain Intel chipset (was that the i815 or something?) reports 'random' values when the bus load is too high.
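For reference, a minimal sketch of a QueryPerformanceCounter-based measurement that also checks whether the counter exists. The helper names `counter_delta_seconds` and `time_sleep_with_qpc` are made up for this example, and returning -1.0 for "no counter" is just a convention chosen here. The Windows-specific part only compiles when `_WIN32` is defined.

```c
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Convert a counter delta to seconds; `freq` is in counts per second.
   Returns -1.0 when no usable counter frequency is known. */
static double counter_delta_seconds(int64_t start, int64_t end, int64_t freq)
{
    if (freq <= 0)
        return -1.0;
    return (double)(end - start) / (double)freq;
}

#ifdef _WIN32
/* Time a 100 ms Sleep with the performance counter.
   Returns -1.0 if the hardware does not provide one. */
static double time_sleep_with_qpc(void)
{
    LARGE_INTEGER freq, t0, t1;

    /* QueryPerformanceFrequency fails when there is no such counter. */
    if (!QueryPerformanceFrequency(&freq) || freq.QuadPart == 0)
        return -1.0;

    QueryPerformanceCounter(&t0);
    Sleep(100);
    QueryPerformanceCounter(&t1);

    return counter_delta_seconds(t0.QuadPart, t1.QuadPart, freq.QuadPart);
}
#endif
```

A caller that gets -1.0 back could fall back to timeGetTime(). The sketch does not try to detect the buggy chipsets mentioned above.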

I didn't mention why I recommended timeGetTime() because I had forgotten about these issues :)
Thanks for reminding me. I hope this information helps. You may want to look up the exact chipset(s) with issues, as it's been a while since I dealt with this.
Posted on 2003-12-05 15:58:19 by Bruce-li
I prefer QPC myself.

Heads up from MSDN -> http://support.microsoft.com/default.aspx?scid=kb;en-us;Q274323
Posted on 2003-12-06 09:16:48 by alpha
Chipset issues -- that might explain the weird data from one of the lab computers. Thank you for the pointers, Bruce and alpha!
Posted on 2003-12-08 13:56:45 by Frank