How does one go about making a high precision timer??

I can't seem to find any examples of what I'm looking for (probably because I'm using the wrong search parameters), but I just want a simple timer that you can calibrate and then use to get delays with ~10-20ns accuracy.

RDTSC seems the way to go, but to use it I need to calculate the CPU frequency, since RDTSC returns clock ticks rather than time.

If I know how many clock ticks a piece of code takes to execute, and I know the frequency of the CPU (i.e. the rate at which the clock ticks are generated), then I can build a good little timer (I hope).
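
The raw read itself seems simple enough (just a sketch of what I mean - StartLo/StartHi are placeholder DD variables):

; sketch: raw TSC read (Pentium+), result is in EDX:EAX
RDTSC                ; EDX:EAX = clock ticks since reset
MOV [StartLo], EAX   ; StartLo/StartHi: placeholder DD variables
MOV [StartHi], EDX
; ... code to time ...
RDTSC
SUB EAX, [StartLo]   ; EDX:EAX = elapsed ticks (64-bit subtract)
SBB EDX, [StartHi]

It's the calibration from ticks to actual time that I'm stuck on.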

If anyone has a good CPU clock speed detection routine, or better still, an ASM example of a high-precision timer that works... any suggestions welcome :)
Posted on 2004-02-25 18:44:40 by Sentient
I guess the title kind of threw you off. :tongue:
http://www.asmcommunity.net/board/index.php?topic=17364
Posted on 2004-02-25 19:34:09 by bitRAKE
See... it's the problem with search terms

resolution vs precision

I apologise.. and Thanks :)
Posted on 2004-02-25 23:27:06 by Sentient
Cheers for the article.. unfortunately it doesn't help

All I need is to figure out how long one piece of code takes to run/loop through, so I can use that same code as a delay function later by changing how many loops it does.

example:

code takes 3ms to run once.

to obtain a 30ms delay I would run the code 10x (this is an example.. I know it's not right) :P


Knowing how many ticks it takes to execute is good enough, but I need a way to convert those ticks to time.
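
In other words, something along these lines (just a sketch of what I mean - TicksPerUS would come from the calibration, and TicksPerUS/StartLo/StartHi are placeholder variables):

; sketch: busy-wait for ECX microseconds using a calibrated rate
; TicksPerUS = measured ticks per microsecond (placeholder)
; clobbers EAX, EDX, ESI, EDI
Delay_us:
    MOV  EAX, ECX
    MUL  DWORD PTR [TicksPerUS]  ; EDX:EAX = target delta in ticks
    MOV  ESI, EAX                ; EDI:ESI = 64-bit target
    MOV  EDI, EDX
    RDTSC
    MOV  [StartLo], EAX
    MOV  [StartHi], EDX
@Wait:
    RDTSC
    SUB  EAX, [StartLo]          ; EDX:EAX = elapsed ticks
    SBB  EDX, [StartHi]
    CMP  EDX, EDI                ; 64-bit compare: elapsed vs target
    JB   @Wait
    JA   @Done
    CMP  EAX, ESI
    JB   @Wait
@Done:
    RET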
Posted on 2004-02-26 00:05:48 by Sentient
Righteo... problem solved.

I have set up the system to get ~50us resolution, but I think I can make it better with a little tweaking.

I haven't done a lot of investigation into device timing requirements, but does anyone know of anything common that requires better resolution than this?

I know that PIO mode on HDDs needs at least 400ns, which on my CPU would be just over 1 clock tick, so not really an issue (btw: an XP2600+ works out to about 0.3us/333ns per tick)

EDIT:

I just had a thought while in the shower :)

If my CPU is running at ~2GHz, shouldn't it generate ~2,000,000,000 clock ticks per second, rather than a lowly 3,000,000??

2,000,000,000 ticks/sec would equate to 0.5ns/tick
3,000,000 ticks/sec would equate to 333ns/tick

Quite a substantial difference as you can see - 3 million ticks is 3MHz, which to put it nicely is CRAP.
Posted on 2004-02-26 01:14:56 by Sentient
You should look into the hi-performance timer... if you're lazy like me, Scronty already coded a generic Timer helper include, which he called "dxutil". Look on Scrontsoft.com for the file if interested. It's a transliteration of the DirectX SDK timer code.
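
The core of it is just two API calls (a rough sketch, not Scronty's actual code - assumes the usual MASM32 includes for the Win32 prototypes):

.data
qwFreq  dq 0        ; counts per second, fixed at boot
qwStart dq 0
qwStop  dq 0
.code
invoke QueryPerformanceFrequency, ADDR qwFreq
invoke QueryPerformanceCounter, ADDR qwStart
; ... code to time ...
invoke QueryPerformanceCounter, ADDR qwStop
; elapsed seconds = (qwStop - qwStart) / qwFreq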
Posted on 2004-02-26 01:35:29 by Homer
I apologize for being an idiot.

I am using the Bochs emulator and have it set to 3,000,000 instructions/sec.

I realize the error of my ways, and will promptly go and hang myself.
Posted on 2004-02-26 01:51:14 by Sentient
Well

I am afraid that Sentient is not expressing himself right:

IMHO he wants it for his own OS, for the 400ns delay specified in the ATA UDMA standards (for HDDs, that is)

So any Windows-based solution is useless for him (that is why this should have been in the OS Construction forum)
... but since nobody came up with a better idea there... I guess that is why he posted here...

Besides, any Windows/OS-based solution is not going to be able to precisely allow a 400ns delay/timer.

AFAIK no other hardware inside the PC has such high resolution except the CPU itself (above 100MHz, that is).

Think about this:
1 Megahertz --> 1 microsecond clock
100 Megahertz --> 10 nanosecond clock
1 Gigahertz --> 1 nanosecond clock
2 Gigahertz --> 500 picosecond clock
3 Gigahertz --> 333 picosecond clock
4 Gigahertz --> 250 picosecond clock

Rarely does one instruction take only one clock, but with caches/pipelines/superscalar units this can be possible in well-scheduled asm applications.

Unfortunately, some CPUs will vary their clock speed to save power and/or cool down.

The only way to reliably detect CPU speed is to set up a 1-second timer in hardware, by either using the PIT 8253 and IRQ0 or the CMOS RTC, and then count how many RDTSC ticks elapsed from start.
Posted on 2004-02-26 05:31:19 by BogdanOntanu
Thanks - that's exactly what I had done.

I just didn't understand why it was only reporting 3,000,000 ticks per second.

I have now solved that problem, and just need to convert the timing to 64-bit rather than just using the EAX from RDTSC.

I didn't realise this forum was only for WinASM coding - I thought it was for ASM in general. I'll pay more attention next time.
Posted on 2004-02-26 12:26:31 by Sentient
You need to write a device driver to handle the TIC (Intel 8253). You have to jam-load the counter with a preset value at various intervals.
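
For the basic rate programming (not the jam-loading ISR itself), it is just a control word to port 43h and a divisor to port 40h - a sketch, assuming the standard 1,193,182 Hz input clock:

; sketch: reprogram 8253/8254 channel 0 to fire IRQ0 ~100 times/sec
; divisor = 1193182 / 100 = 11932
MOV AL, 36h        ; channel 0, lobyte/hibyte access, mode 3
OUT 43h, AL
MOV AX, 11932      ; reload value
OUT 40h, AL        ; low byte first...
MOV AL, AH
OUT 40h, AL        ; ...then high byte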
Posted on 2004-02-26 13:37:24 by mrgone
Could someone take a look at this code and see if they can figure out why it gives the wrong result.

The problem is:
When I run under Bochs with an instruction rate of 2,000,000, the code works perfectly.
When I run it from a boot disk (fresh reboot), at the end of the code it correctly reports that
TimeStamp3=3,000,000 (which is 3,000,000,000 div 1000), BUT TimeStamp1 is just completely weird, and TimeStamp2 contains the result that should be in TimeStamp1.

This code is running in protected mode, with CS = a flat 4GB code segment and DS/ES/FS/GS = flat 4GB data. The stack is also set up correctly.

I can only guess that it is some kind of problem with the DIV instructions. It works fine in Bochs, but on a real boot where the RDTSC difference is ~3,000,000,000, I don't seem to be able to divide by 1,000,000, although a DIV by 1000 works fine.

I know some of the code is unnecessary, but it's there for the sake of making sure everything contains the correct values.


Restart:
    MOV AL, 0                        ; select RTC register 0 (seconds)
    OUT 70h, AL
    IN AL, 71h
    AND EAX, 0Fh                     ; keep low BCD digit of the seconds
    MOV ECX, 3
@Loop2:
    MOV EBX, EAX                     ; EBX = last sampled RTC seconds
    DB 0Fh,31h ; RDTSC
    MOV DWORD PTR [TimeStamp1], EAX  ; save TSC start, low dword
    MOV DWORD PTR [TimeStamp2], EDX  ; save TSC start, high dword
MainLoop:
    MOV AL, 0                        ; poll RTC seconds until they change
    OUT 70h, AL
    IN AL, 71h
    AND EAX, 0Fh
    CMP EAX, EBX
    JE MainLoop
    LOOP @Loop2

    DB 0Fh,31h ; RDTSC
    SUB EAX, DWORD PTR [TimeStamp1]  ; EDX:EAX = 64-bit tick delta
    SBB EDX, DWORD PTR [TimeStamp2]

    MOV ECX, 1000
    DIV ECX                          ; delta / 1000
    MOV DWORD PTR [TimeStamp3], EAX

    MOV EAX, DWORD PTR [TimeStamp3]
    MOV EDX, 0
    MOV ECX, 1000
    DIV ECX                          ; (delta / 1000) / 1000

    MOV DWORD PTR [TimeStamp1], EAX  ; quotient
    MOV DWORD PTR [TimeStamp2], EDX  ; remainder

    MOV EDI, 0B8000h                 ; print the three values to screen
    MOV EAX, [TimeStamp1]
    CALL CRT_PrintDec STDCALL, EAX
    MOV EDI, 0B80A0h
    MOV EAX, [TimeStamp2]
    CALL CRT_PrintDec STDCALL, EAX
    MOV EDI, 0B8140h
    MOV EAX, [TimeStamp3]
    CALL CRT_PrintDec STDCALL, EAX

    JMP Restart

TimeStamp1 DD 0
TimeStamp2 DD 0
TimeStamp3 DD 0
Posted on 2004-02-27 01:32:43 by Sentient
BTW:

The code basically samples the seconds field of the RTC, and loops until the seconds increment by one. It then resamples RDTSC and calculates the difference - then divides it by 1,000,000 to get the 'ticks per microsecond'.

When it divides by 1000 (the first time) it returns the correct results.

When it divides by 1000 (the second time) it returns the correct result in Bochs (with a simulated instruction rate of 2,000,000), but returns incorrect results when booted from floppy (instruction rate around 4,000,000,000).

After first thinking it was an unsigned/signed conflict of some kind, I now realise it can't be, because the first DIV works correctly on the larger number, yet dividing the result by 1000 again gives the wrong answer.

Hopefully its something obvious to someone.

PS: I have an AMD Athlon XP3000+ - just in case there is an AMD/Athlon problem I'm missing
Posted on 2004-02-27 01:37:33 by Sentient
Well:

I see no errors at first glance, just some observations:

1) RDTSC is NOT a serializing instruction, so maybe use a CPUID in front ;) (see the sketch at the end of this post). Otherwise some code will execute out of sync with RDTSC... only on a real machine though, because I think Bochs will not emulate the "out-of-order" speculative execution of some CPUs.

2) Why loop using ECX 3 times? After the first pass, IMHO the value in EBX will get corrupted?

3) Why divide 2 times by 1000 instead of only once by 1,000,000 decimal? Do you have an exception handler for DIV errors in place in your OS? (OK, I do not :tongue:) Because IF the result of the division is greater than EAX, an exception will fire up... and this is quite possible when dividing by only 1,000 decimal.

4) Why move to [TimeStamp3] and then immediately read it back? Is it some kind of testing/paranoia? (I know I do that myself sometimes... but still...) I noticed you say that some of the code is indeed unnecessary, BUT in debugging any extra instruction could be the cause of the problems... so streamlining is better IMHO.

5) I would rather use IRQ0 and count 18 IRQs, or reprogram the PIT to fire 100 IRQs/second and count 100 of them, or use the PIT itself... rather than reading the RTC. The RTC is not that accurate or fast... but I agree that it SHOULD work.
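
For point 1, the serialization looks something like this (just a sketch - remember CPUID trashes EAX, EBX, ECX and EDX, so save what you need first):

; sketch: serialize so earlier instructions can't leak past the read
XOR EAX, EAX       ; CPUID leaf 0
CPUID              ; serializing - all prior instructions retire first
DB 0Fh,31h ; RDTSC ; EDX:EAX = timestamp, read in program order now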
Posted on 2004-02-27 03:56:17 by BogdanOntanu
Sentient,

Are you following the guidelines in this document? http://www.math.uwaterloo.ca/~jamuir/rdtscpm1.pdf

Ratch
Posted on 2004-02-27 04:22:57 by Ratch
You're using the RTC. Unfortunately the real-time clock is not accurate in seconds. You can observe this for yourself: take a good Swiss-movement watch and compare its second hand to the seconds on the PC clock on the task bar. You'll notice that before the minute is out the second hand will make a radical jump. This is because the system timer ISR is not written for accurate seconds. What they do is issue a correction to the timer at some particular interval. What I'm saying is it could be better.

This must be done at the TIC level. I would start by downloading the spec sheets on the TIC from Intel - I believe right off hand it is the 8253. It has three internal timers; I think only two are used. I am just swamped with work right now and would love to help you, but only have time to point out the discrepancies. You must write an ISR that can jam-load one of the counters at various intervals in order to achieve accurate seconds and portions thereof.

I can give you an example using a PIC so you can see a possible solution. That routine keeps one or two rolling counters in memory telling the ISR when to jam-load the counter with various presets, depending on crystal frequency alias and stray capacitance, in order to achieve a software-adjustable accurate clock. The same techniques can be applied to the PC timer. Believe me, with the PC RTC you're only accurate to within a minute.

Yeah Bogdan, me too
Posted on 2004-02-27 10:51:18 by mrgone
BogdanOntanu

I tried your CPUID suggestion, and at first glance it seems to cause more problems than before, so I'll investigate it more thoroughly later.

I loop 3 times because the first time through, the RTC may already be partway through a second, so an 'accurate' result will not be available until at least the 2nd loop. EBX is not corrupted, because it gets reloaded with the RTC seconds that are sampled on every loop while it is waiting for the seconds to change.

It simply loops, comparing EAX (the current RTC seconds) to EBX (the saved seconds) until they are different, then it calculates the RDTSC difference and loops back (until ECX=0). When it loops back, EBX is reloaded with the last sampled seconds (which is in EAX).

The reason I was dividing by 1000, and then again, is that when I divided by 1,000,000 it returned incorrect results. When I divided by 1000, it was correct. If I divided by 1000 again, it was wrong.

The reload was just to confirm to myself that EDX:EAX contained the correct values - I was originally not doing this and was still getting errors.

I have the PIT reprogrammed to 100Hz, but for simplicity I used the RTC - if I can't get it to work, I'll try the PIT later.
Posted on 2004-02-27 13:51:40 by Sentient