Does anybody know of code or a program for comparing execution speed?

I need to know which of two pieces of code runs faster.

I tried GetTickCount and GetSystemTime, but a millisecond is too coarse for some parts of the code.

By the way, I would rather not put the code in a loop of thousands of iterations, because it uses global memory allocation. But I still need to compare the speeds in order to write better code.

Has anybody needed something like this, or does anyone have a suggestion?
Posted on 2004-01-18 20:47:22 by cakmak
The RDTSC Pentium instruction or the QuerryPerformanceCounter API can also be used for this.
I think there are many examples on this forum; use the Search feature.
Posted on 2004-01-18 21:26:38 by BogdanOntanu
Thank you very much, I will check it out, but I couldn't find QuerryPerformanceCounter here :grin:
Posted on 2004-01-18 21:59:32 by cakmak
Try doing a board search for "profile". And/or get a copy of Intel's VTune (there should be evaluation downloads from intel.com?).

What's "global memory"?
Posted on 2004-01-19 00:27:53 by f0dder

What's "global memory"?


GlobalAlloc
Posted on 2004-01-19 00:45:54 by donkey
Repeat after me: There's no such thing as global memory, and GlobalAlloc is bad ^_^
Posted on 2004-01-19 04:51:07 by f0dder
GlobalAlloc is bad ^_^

:) It doesn't matter. It was only an example of dynamic memory allocation. I suppose you prefer HeapAlloc. Thank you for the suggestion, but the latest release of VTune does not work on Windows 98. I will try searching for "profile" for asm, though. Anyway, this time I was working on C++ code, and I found this:


2. CPUID/RDTSC

You have probably heard of the assembler instructions CPUID and RDTSC, which are supported by VC++ 6 and higher. If you have an older compiler, you can use __emit instead. This is a pseudo-asm instruction that lets you insert bytes directly into the generated .exe. So, with a compiler that doesn't know the cpuid/rdtsc instructions, a small macro can put their opcodes directly into the program as binary values:


#define rdtsc __asm __emit 0fh __asm __emit 031h
#define cpuid __asm __emit 0fh __asm __emit 0a2h


3. Exact time measuring

If you want to measure something short, like the execution time of a function, or compute the framerate of your game as exactly as possible, you have several ways to do this. One good way, especially in a Win32 app, is to use QueryPerformanceCounter.
But with a little bit of assembler and the RDTSC (Read Time-Stamp Counter) instruction mentioned earlier, it is possible to read the current CPU cycle count. If we read it before and after a function executes, the difference tells us how many clock cycles the function needed.
To do this we can use the following function:

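__int64 GetCPUCount ( unsigned int loword, unsigned int hiword )
{
    // loword and hiword are passed by value and only serve as scratch storage
    _asm
    {
        _emit 0x0f // insert rdtsc opcode
        _emit 0x31
        mov hiword , edx // upper 32 bits of the counter
        mov loword , eax // lower 32 bits of the counter
    }
    return ( (__int64) hiword << 32 ) + loword;
}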

Now we can get the number of clockcycles a function needs with something like

unsigned int hi = 0, lo = 0;
// keep the counts as 64-bit integers; a double cannot represent every 64-bit value exactly
__int64 t = GetCPUCount ( lo, hi );
MyFunction ();
__int64 CycleCount = GetCPUCount ( lo, hi ) - t;
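
A minimal sketch of the same before/after measurement without inline assembly, assuming a newer x86 compiler that provides the __rdtsc intrinsic (this is not the article's method, and VC++ 6 does not have it). CyclesFor and the body of MyFunction are hypothetical, made up for illustration only:

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>     // __rdtsc on MSVC
#else
#include <x86intrin.h>  // __rdtsc on GCC/Clang, x86 targets only
#endif

// Hypothetical stand-in for the function being measured.
// The volatile sink keeps the compiler from optimizing the loop away.
static volatile unsigned sink = 0;
static void MyFunction() {
    for (unsigned i = 0; i < 1000; ++i) sink = sink + i;
}

// Read the time-stamp counter as a single 64-bit value.
static inline std::uint64_t GetCPUCount() {
    return __rdtsc();
}

// Cycles elapsed while fn runs. The result includes the cost of the
// two counter reads and anything the OS schedules in between.
template <typename Fn>
std::uint64_t CyclesFor(Fn fn) {
    const std::uint64_t start = GetCPUCount();
    fn();
    return GetCPUCount() - start;
}
```

Calling CyclesFor(MyFunction) returns the cycle difference as an unsigned 64-bit integer, so there is no need to combine the high and low words by hand.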

But keep in mind that you can never get the EXACT cycle count the function used! After all, we're working on a multitasking/multithreading OS, so Windows may spend some of the measured clock cycles on other work while our program runs.
To reduce that interference, you can first raise your program's priority level. Use SetPriorityClass ( HandleToOurProcess, REALTIME_PRIORITY_CLASS );
But be careful with that: Windows might not like it, and you won't be able to use the mouse while the program runs or until the priority level is set back to normal.
It's not very good: it gives different results every time, but the average value is easy to see.
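
Since single readings jump around, one way to summarize them is to repeat the measurement a small number of times and keep both the minimum and the average. The following is only a sketch under assumptions: it uses the standard std::chrono::steady_clock rather than RDTSC, and TimingSummary and MeasureRepeated are hypothetical names made up for illustration:

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <limits>

// Summary of several timed runs, in nanoseconds.
struct TimingSummary {
    std::int64_t min_ns;  // fastest run, least disturbed by the OS
    std::int64_t avg_ns;  // average over all runs
};

// Time fn() 'runs' times (runs must be greater than zero).
template <typename Fn>
TimingSummary MeasureRepeated(Fn fn, int runs) {
    using clock = std::chrono::steady_clock;
    std::int64_t best = std::numeric_limits<std::int64_t>::max();
    std::int64_t total = 0;
    for (int i = 0; i < runs; ++i) {
        const auto start = clock::now();
        fn();
        const auto stop = clock::now();
        const std::int64_t ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
        best = std::min(best, ns);
        total += ns;
    }
    return TimingSummary{best, total / runs};
}
```

The minimum is usually the steadier figure when comparing two pieces of code, because interruptions can only make a run slower, never faster. The actual resolution of steady_clock depends on the compiler and platform.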

Nice days
Posted on 2004-01-19 20:58:32 by cakmak
Use CPUID to force the processor to finish what it is doing before taking a measurement.
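__int64 GetCPUCount ( unsigned int loword, unsigned int hiword )
{
    _asm
    {
        push ebx // cpuid overwrites ebx, and the compiler cannot see that through _emit
        xor eax, eax
        _emit 0x0f // insert cpuid opcode
        _emit 0xa2
        _emit 0x0f // insert rdtsc opcode
        _emit 0x31
        pop ebx // restore ebx; eax and edx still hold the counter
        mov hiword , edx
        mov loword , eax
    }
    return ( (__int64) hiword << 32 ) + loword;
}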
The instructions surrounding CPUID/RDTSC can still pair with other instructions and introduce inaccuracies.
Posted on 2004-01-19 21:42:26 by bitRAKE
Yes, you are right. Thank you for info.

Nice days
Posted on 2004-01-20 17:56:12 by cakmak