I've made this function that determines the CPU speed (*somewhat* by myself :tongue: ). But the problem is that it always reports different speeds. It hangs around ~1696.49 MHz, but it's always slightly different: sometimes .48, sometimes .5.

And if I have it return the value in Hz, then it's always different...

1696494180, 1696497854, 1696493265 etc... just mild differences.

Is there any way to get a 100% accurate, solid result every time? The code I'm using to determine the speed is in C++ (I'm just starting with ASM and I can't make the whole thing in ASM yet).

    union ticksInt
    {
        __int32 i32[2];
        __int64 i64;
    };

    inline __int64 GetTicks()
    {
        ticksInt a;
        __asm rdtsc              // cycle count into edx:eax
        __asm mov a.i32[0], eax  // low dword (byte offset 0)
        __asm mov a.i32[4], edx  // high dword (byte offset 4)
        return a.i64;
    }

    unsigned int CPUSpeed()
    {
        __int64 timeStart, timeStop;
        __int64 startTick, endTick;

        __int64 overhead = GetTicks() - GetTicks(); // approximates the cost of one GetTicks call

        // wait for the start of a fresh timer tick
        timeStart = timeGetTime();
        while( timeGetTime() == timeStart )
            ;
        timeStart = timeGetTime();
        startTick = GetTicks();

        // busy-wait until at least 1000 ms have elapsed
        do
        {
            timeStop = timeGetTime();
        } while( (timeStop - timeStart) < 1000 );
        endTick = GetTicks();

        return (unsigned int)((endTick - startTick) + overhead);
    }

Oh, and I call this function from a real-time, high-priority thread, so the chances of this function getting interrupted are low.

Posted on 2003-01-28 14:25:56 by IFooBar
It's probably due to inaccuracy in the timing; I doubt you'll ever be able to get 100% perfect results.
Posted on 2003-01-28 15:00:22 by Eóin
of course there is a way:

the larger the measuring interval, the more accurate the result. If you use a 1000 ms interval, you get about 0.1% inaccuracy; with a 10 s interval, about 0.01%; and so on... Make it 100 s :P . And you must not assume that the measuring interval was exactly 1000 ms! It could have been 1002 ms instead, so you should do this:

    TestCPU proc
        local eax2:DWORD, edx2:DWORD
        local timetaken:DWORD, Result:DWORD

        call GetTickCount
        mov timetaken, eax
        rdtsc                   ; cycle count into edx:eax
        mov eax2, eax
        mov edx2, edx
        push 10000              ; milliseconds
        call Sleep
        call GetTickCount
        sub eax, timetaken
        mov timetaken, eax      ; actual elapsed milliseconds

        rdtsc
        sub eax, eax2
        sbb edx, edx2           ; 64-bit subtract needs the borrow
        mov ecx, timetaken
        div ecx                 ; divide 64-bit edx:eax by 32-bit ecx, quotient in eax
        mov Result, eax         ; cycles per millisecond
        ret
    TestCPU endp

The result is in kilohertz (cycles per millisecond).
If you want hertz, multiply the 64-bit integer in edx:eax by 1000.
This is the best way to test CPU speed.
Posted on 2003-01-31 22:25:30 by Ultrano
(created on the fly)

multiply a 64-bit integer by 1000:
edx - high dword
eax - low dword

    mov edx2, edx   ; save the high dword
    mov ecx, 1000
    mul ecx         ; low dword * 1000 -> edx:eax
    push eax        ; low dword of the result
    push edx        ; carry into the high dword
    mov eax, edx2   ; mul takes its operand in eax
    mul ecx         ; high dword * 1000
    pop edx         ; the carry
    add edx, eax    ; high dword of the result
    pop eax         ; result is in edx:eax
Posted on 2003-01-31 22:35:40 by Ultrano
Just a note: generally your clock speed should be your bus speed times your clock multiplier. My multiplier is 10.5 with a bus speed of 133 MHz, so it should be exactly 1396.5 MHz (1396500000 Hz). But nothing is perfect; minute defects can effectively speed it up or slow it down a few hertz, and there is really no way on a PC to measure a second to millions of digits of accuracy. Do experiment with QueryPerformanceCounter and QueryPerformanceFrequency, though: they measure a length of time with much finer precision, so while Sleep() may show 1000 ms and that API shows 1002 ms, you can correct using the difference. But as I said, nothing is perfect.
Posted on 2003-02-01 21:29:18 by Qages
So in other words you guys are saying that it's basically not possible to get 100% accurate results :( because of many different factors. Peh... I was kind of hoping it was possible. So then, is it OK to time the whole app using something like

    #pragma warning(push)
    #pragma warning(disable: 4035) // C4035 "no return value": it's left in edx:eax
    __int64 gettick()
    {
        __asm rdtsc // VC returns an __int64 in edx:eax, so no explicit return needed
    }
    #pragma warning(pop)

    __int64 elapsedSecs = gettick() / cpuSpeed;

Or will using rdtsc with the approximate CPU speed cause a lot of inaccuracy eventually?

thanks for your replies so far
Posted on 2003-02-02 03:33:52 by IFooBar
Won't these methods of CPU speed detection fail completely on 'mobility' processors?

I mean, a 1.7 GHz processor running at 1.2 GHz would return 1.2 GHz. So won't your program fail if it wants to know the 'actual' speed of the CPU?
Posted on 2003-02-04 01:27:06 by clippy
Hi, would you test my code and correct any errors? I'm not sure the fractional parts are all right, and I haven't had time to check.

I have attached it.

Posted on 2004-09-29 19:21:24 by >Matrix<
The CLI and STI instructions do not affect interrupts in any way in V86 mode. They only change a flag that tells the operating system whether to lead the program into an interrupt handler upon receiving interrupts that the running program has enabled.
Posted on 2004-09-30 11:56:14 by Sephiroth3
Hi, thank you for clearing that up.

Posted on 2004-09-30 12:03:29 by >Matrix<
Perhaps this discussion about measuring CPU speed could be of interest.
Posted on 2004-09-30 12:17:55 by Jibz
Sorry, no.
I need extremely accurate timings for some hardware-related delays;
1 Hz resolution at 927 MHz is enough for my CPU frequency measurement.
I will not call any Windows functions to get the CPU frequency.

Posted on 2004-09-30 12:32:35 by >Matrix<
Ah, sorry. I didn't realize you woke up some thread from half a year ago in order to post a question. I read the initial post and replied to that.
Posted on 2004-09-30 12:52:53 by Jibz
Well, it wasn't exactly a question,
but it might have become one, which I would then have answered. :)
I wanted to post a CPU frequency measurement routine in a topic related to this, rather than create a new topic just for my code.

Posted on 2004-09-30 13:03:43 by >Matrix<
heh, that goal is a bit optimistic, seeing as the crystal that drives the CPU clock is not gonna be any better than 20 ppm ;)
Just doing one long measurement will reduce errors, but they're still there. You also either waste time during init, or do it 'overlapped' and require the app to call your startup function early on (thus leaking implementation details). Why not take several smaller samples and choose from among those?
Here's my code to do so (in C++):

    static void measure_cpu_freq()
    {
        // set max priority, to reduce interference while measuring.
        int old_policy; static sched_param old_param;   // (static => 0-init)
        pthread_getschedparam(pthread_self(), &old_policy, &old_param);
        static sched_param max_param;
        max_param.sched_priority = sched_get_priority_max(SCHED_RR);
        pthread_setschedparam(pthread_self(), SCHED_RR, &max_param);

        // make sure the TSC is available, because we're going to
        // measure actual CPU clocks per known time interval.
        // counting loop iterations ("bogomips") is unreliable.
        // note: no need to "warm up" cpuid - it will already have been
        // called several times by the time this code is reached.
        // (background: it's used in rdtsc() to serialize instruction flow;
        // the first call is documented to be slower on Intel CPUs)
        if(tsc_available())   // placeholder name for the TSC capability check
        {
            int num_samples = 16;
            // if clock is low-res, do fewer samples so it doesn't take too long.
            // balance measuring time (~10 ms) and accuracy (< 0.1% error -
            // ok for using the TSC as a time reference)
            if(timer_res() >= 1e-3)
                num_samples = 8;
            std::vector<double> samples(num_samples);

            int i;
            for(i = 0; i < num_samples; i++)
            {
                double dt;
                i64 dc;
                // i64 because VC6 can't convert u64 -> double,
                // and we don't need all 64 bits.

                // count # of clocks in max{1 tick, 1 ms}:
                // .. wait for start of tick.
                const double t0 = get_time();
                u64 c1; double t1;
                do
                {
                    // note: get_time effectively has a long delay (up to 5 µs)
                    // before returning the time. we call it before rdtsc to
                    // minimize the delay between actually sampling time / TSC,
                    // thus decreasing the chance for interference.
                    // (if unavoidable background activity, e.g. interrupts,
                    // delays the second reading, inaccuracy is introduced).
                    t1 = get_time();
                    c1 = rdtsc();
                }
                while(t1 == t0);
                // .. wait until start of next tick and at least 1 ms elapsed.
                do
                {
                    const double t2 = get_time();
                    const u64 c2 = rdtsc();
                    dc = (i64)(c2 - c1);
                    dt = t2 - t1;
                }
                while(dt < 1e-3);

                // .. freq = (delta_clocks) / (delta_seconds);
                // cpuid/rdtsc/timer overhead is negligible.
                const double freq = dc / dt;
                samples[i] = freq;
            }

            std::sort(samples.begin(), samples.end());

            // median filter (remove upper and lower 25% and average the rest).
            // note: don't just take the lowest value! it could conceivably be
            // too low, if background processing delays reading c1 (see above).
            double sum = 0.0;
            const int lo = num_samples/4, hi = 3*num_samples/4;
            for(i = lo; i < hi; i++)
                sum += samples[i];
            cpu_freq = sum / (hi-lo);
        }
        // else: TSC not available, can't measure; cpu_freq remains unchanged.

        // restore previous policy and priority.
        pthread_setschedparam(pthread_self(), old_policy, &old_param);
    }

Posted on 2004-10-01 22:15:04 by Jan Wassenberg
Hi, have you tested this code?

If you do this, won't you miss the zero?
while(t1 == t0);

No comment; 20 ppm must be enough for me, and if not I will build external hardware.
So you think measuring the TSC against the PIT is not a good idea?
It works quite well under DOS, and it misses 1 out of 25 under Windows with a 72-tick measurement time. It gives me Hz.

Posted on 2004-10-02 07:14:22 by >Matrix<
If you do this, won't you miss the zero?
while(t1 == t0);

What do you mean by zero? The purpose of that loop is to wait for the start of a timer tick, since get_time may only be running at 10 ms resolution (depending on the timer hardware available). Otherwise we might catch it right at the end of a tick, reporting 10 ms elapsed when it's really only, say, 1 ms.

No comment; 20 ppm must be enough for me, and if not I will build external hardware.

Whoa, hard-core :) What's the application that requires so much precision?

So you think measuring the TSC against the PIT is not a good idea?
It works quite well under DOS, and it misses 1 out of 25 under Windows with a 72-tick measurement time. It gives me Hz.

Ah, I thought you were running Windows only? It's fine on DOS, where cli actually works and you can prevent just about all interference. On Windows, you may be able to access a better timer that runs at 3x the PIT frequency (the ACPI aka PM timer; maybe check the Linux sources for how to access it). You also need some way to detect those 1-in-25 bad samples.
BTW, due to jitter (which varies with temperature, e.g. if the computer has just been powered up - fun), you have a frequency of, say, 980..1020 MHz. It's pointless to try to measure it down to 1 Hz ;) If you manage to get a stable value, it's because all the errors and fluctuations are being averaged out.
Posted on 2004-10-02 08:08:30 by Jan Wassenberg
An example: an undersampling oscilloscope's time base is critical.
A less strict but still critical application:
programming some microcontrollers also needs to be accurate to +-10% at a 100 µs delay.
Another application:
vertical/horizontal deflection of a laser beam to draw curves also needs accurate timings (and a fast computer).

What if I can't do it on Windows? Nothing; I'll buy another computer for that application and it will run under DOS, or boot as its own OS.

Posted on 2004-10-02 09:41:03 by >Matrix<
Ah, OK, for hard-realtime apps, definitely go for DOS.
Again, though, the last of your worries should be nailing the CPU frequency down to 0.001 ppm ;)
Posted on 2004-10-02 12:09:20 by Jan Wassenberg