Hi!
I know that on the Pentium it's better to write your own loop
instead of using rep stos, etc.
But I couldn't find anything regarding the performance hit when the rep prefix is used on PII/PIII. Did anybody check it out?
Posted on 2002-06-29 11:29:19 by Sergo
String instructions are fine with the repeat prefix on PPRO+
Posted on 2002-06-29 11:51:02 by comrade
Sergo,

On later Pentiums, REP MOVS and REP STOS are both special-cased, so you will have no problems using either.

Generally it's a good idea to write your own loops for code more complex than a block move or fill, but in these two instances REP MOVS/STOS works fine and is usually faster than a manually written loop.
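
For anyone who wants to try the fill case, here is a rough sketch of the two variants side by side. It is written in C with GCC/Clang inline assembly rather than MASM, it assumes an x86 target (anything else falls back to the plain loop), and the names fill_loop and fill_rep_stosd are made up for illustration. It only shows the two code shapes; it does not measure anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hand-written loop: store `value` into `count` DWORDs starting at dst. */
void fill_loop(uint32_t *dst, uint32_t value, size_t count)
{
    assert(dst != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        dst[i] = value;
}

/* Same fill done with REP STOSD (spelled "rep stosl" in AT&T syntax).
   Assumption: GCC or Clang on x86; other targets use the loop above.
   The direction flag is clear on function entry per the C ABI, so the
   store runs forward through memory. */
void fill_rep_stosd(uint32_t *dst, uint32_t value, size_t count)
{
    assert(dst != NULL || count == 0);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile ("rep stosl"
                      : "+D"(dst), "+c"(count)   /* EDI/RDI = dest, ECX/RCX = count */
                      : "a"(value)               /* EAX = fill value */
                      : "memory");
#else
    fill_loop(dst, value, count);
#endif
}
```

Both functions should leave the buffer in the same state, so any difference you see when timing them comes from how the processor handles the REP prefix.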

Regards,

hutch@movsd.com
Posted on 2002-06-30 00:01:42 by hutch--
I once actually tested this, using rep, normal movs, and MMX movs.
If you know the data is aligned, the MMX mov pretty much rules!
DWORD movs in a loop will beat rep movsd if the amount of data (the count in ecx) is smaller than the processor cache. If it is larger, rep movsd is special-cased in the processor and reaches transfer rates roughly equivalent to those of the MMX movs.

Having said that, if the data is smaller than the processor cache, the actual time lost is minimal because there is so little data anyway! The only time it could come into play is if you repeatedly copy small buffers around memory.
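
The two non-MMX copy variants compared above look roughly like this. This is only a sketch in C with GCC/Clang inline assembly, assuming an x86 target (other targets fall back to the loop); the names copy_dword_loop and copy_rep_movsd are hypothetical, and the MMX version is left out. To reproduce the comparison you would time each one over buffers smaller and larger than the cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* DWORD moves in a plain loop: copy `count` DWORDs from src to dst. */
void copy_dword_loop(uint32_t *dst, const uint32_t *src, size_t count)
{
    assert((dst != NULL && src != NULL) || count == 0);
    for (size_t i = 0; i < count; i++)
        dst[i] = src[i];
}

/* Same copy done with REP MOVSD (spelled "rep movsl" in AT&T syntax).
   Assumption: GCC or Clang on x86, buffers do not overlap.
   The direction flag is clear on function entry per the C ABI. */
void copy_rep_movsd(uint32_t *dst, const uint32_t *src, size_t count)
{
    assert((dst != NULL && src != NULL) || count == 0);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile ("rep movsl"
                      : "+D"(dst), "+S"(src), "+c"(count) /* EDI, ESI, ECX */
                      :
                      : "memory");
#else
    copy_dword_loop(dst, src, count);
#endif
}
```

Both routines produce identical results, so they can be swapped freely inside a timing harness.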

Mirno
Posted on 2002-06-30 04:10:26 by Mirno