Qages
.data?
buffer01 byte 260 dup(?)
If buffer01 had 259 characters in it, and I have already used it and want to clear this (.data?) buffer, will your code do it?
Or will a simple
mov buffer01[0], 0
do the same thing?
Posted on 2002-05-13 16:55:25 by Qages
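To make the difference concrete, here is a minimal sketch, assuming MASM and the 260-byte buffer01 from the question. The label ClearBuffer01 is a made-up name for illustration. Writing a single zero to buffer01[0] only makes the buffer an empty zero-terminated string; the other 259 bytes keep their old contents. Clearing every byte takes a fill:

```asm
.386
.model flat
.data?
buffer01 byte 260 dup(?)
.code
ClearBuffer01:                          ; hypothetical helper name
        ; Option 1: empty string only, the remaining bytes are untouched
        mov     buffer01[0], 0

        ; Option 2: zero all 260 bytes
        push    edi
        xor     eax, eax                ; value to store
        mov     edi, offset buffer01    ; destination
        mov     ecx, 260 / 4            ; 65 dwords = 260 bytes
        cld                             ; store upwards
        rep     stosd
        pop     edi
        ret
end
```

If the buffer is only ever read as a zero-terminated string, Option 1 is usually enough. Option 2 matters when stale bytes must not survive.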
Big Buffer
<edit> -> 1 qword per cycle (NO!!)<edit>
.586
.MMX
.model flat
.code
public ZeroMemory
; As written, the byte count is read from the first stack argument
; and the destination pointer from the second.
ZeroMemory: push edi
mov edi, dword ptr [esp+12] ; edi = destination
mov ecx, dword ptr [esp+8] ; ecx = byte count
movq qword ptr [esp-8], MM0 ; save caller's MM0 (scratch slot below ESP, as in the original)
mov edx, ecx
pxor MM0, MM0 ; MM0 = 0
and edx, -256 ; bytes covered by whole 256-byte blocks
je _32
add edi, edx ; edi = end of the 256-byte region
neg edx ; edx runs from -size up to 0
_CopyBy256: movq [edi+edx], MM0
movq [edi+edx+8],MM0
movq [edi+edx+16],MM0
movq [edi+edx+24],MM0
movq [edi+edx+32],MM0
movq [edi+edx+40],MM0
movq [edi+edx+48],MM0
movq [edi+edx+56],MM0
movq [edi+edx+64],MM0
movq [edi+edx+72],MM0
movq [edi+edx+80],MM0
movq [edi+edx+88],MM0
movq [edi+edx+96],MM0
movq [edi+edx+104],MM0
movq [edi+edx+112],MM0
movq [edi+edx+120],MM0
movq [edi+edx+128],MM0
movq [edi+edx+136],MM0
movq [edi+edx+144],MM0
movq [edi+edx+152],MM0
movq [edi+edx+160],MM0
movq [edi+edx+168],MM0
movq [edi+edx+176],MM0
movq [edi+edx+184],MM0
movq [edi+edx+192],MM0
movq [edi+edx+200],MM0
movq [edi+edx+208],MM0
movq [edi+edx+216],MM0
movq [edi+edx+224],MM0
movq [edi+edx+232],MM0
movq [edi+edx+240],MM0
movq [edi+edx+248],MM0
add edx, 256
jne _CopyBy256
_32: mov edx, ecx
and edx, 255
and edx, -32 ; bytes covered by whole 32-byte blocks
je _4
add edi, edx
neg edx
_CopyBy32: movq [edi+edx], MM0
movq [edi+edx+8],MM0
movq [edi+edx+16],MM0
movq [edi+edx+24],MM0
add edx, 32
jne _CopyBy32
_4: mov edx, ecx
xor eax, eax ; stosd/stosb store eax/al, so it must be zero
and ecx, 31
shr ecx, 2 ; remaining whole dwords
rep stosd
_1: mov ecx, edx
and ecx, 3 ; remaining 0..3 bytes
rep stosb
_Done: movq MM0, qword ptr [esp-8] ; restore caller's MM0
pop edi
ret 8 ; no EMMS here: the caller must execute EMMS before any x87 FPU code
end
Ha Ha Very Funny
Moving my dwords is just as fast
because of pairing; movqs cannot pair.
I thought all MMX instructions were pairable, except for write-after-write or read-after-write dependencies and EMMS?
I tried Qages' ZeroMemory and it works perfectly. I had great fun with it, and it showed me how this thing really works. I have only one problem: I don't want to use PTR.
I want to ask: does bdjames' or buliaNaza's version work on the Intel 386 processor and on AMD?
Thanks
cmax, MMX came out with the Pentium (the Pentium MMX), so the MMX version will not run on a 386.
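If the same program has to run on both old and new processors, one option is to test for MMX at run time and fall back to a plain string-instruction fill. The sketch below is only an illustration under that assumption. HasMMX is a made-up name. It relies on two documented facts: the ID flag (bit 21 of EFLAGS) can only be toggled when CPUID exists, and CPUID function 1 reports MMX in bit 23 of EDX.

```asm
.586
.model flat
.code
public HasMMX                   ; hypothetical name
; Returns eax = 1 if MMX is available, otherwise eax = 0.
HasMMX:
        pushfd
        pop     eax             ; eax = EFLAGS
        mov     ecx, eax        ; keep the original value
        xor     eax, 200000h    ; flip the ID bit (bit 21)
        push    eax
        popfd
        pushfd
        pop     eax             ; read EFLAGS back
        push    ecx
        popfd                   ; restore the original EFLAGS
        xor     eax, ecx        ; bits that actually changed
        test    eax, 200000h
        jz      NoMMX           ; ID bit did not toggle: no CPUID, so no MMX
        push    ebx             ; CPUID overwrites ebx
        mov     eax, 1
        cpuid                   ; edx = feature flags
        pop     ebx
        xor     eax, eax
        test    edx, 800000h    ; bit 23 = MMX
        setnz   al
        ret
NoMMX:  xor     eax, eax
        ret
end
```

The CPUID instruction is only reached when the ID bit toggles, so the routine itself executes nothing newer than 386 instructions on an older processor.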
It turns out that using just rep movsd
with rep movsb is the fastest way on
a PII.
I do not know how to beat it. You could
use DMA, but that is nasty. :(
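For reference, the rep movsd plus rep movsb combination described above could look roughly like this. It is a sketch, not anyone's posted code: the name CopyMem and the stdcall argument order (dest, src, count) are assumptions, and the regions are assumed not to overlap.

```asm
.386
.model flat
.code
public CopyMem                          ; hypothetical name
; Assumed stdcall layout: CopyMem(dest, src, count)
CopyMem:
        push    esi
        push    edi
        mov     edi, dword ptr [esp+12] ; dest
        mov     esi, dword ptr [esp+16] ; src
        mov     ecx, dword ptr [esp+20] ; count in bytes
        mov     edx, ecx
        shr     ecx, 2                  ; number of whole dwords
        cld                             ; copy upwards
        rep     movsd
        mov     ecx, edx
        and     ecx, 3                  ; 0..3 leftover bytes
        rep     movsb
        pop     edi
        pop     esi
        ret     12
end
```

The same split works for clearing: load eax with zero and use rep stosd followed by rep stosb instead of the movs pair. Nothing here needs more than a 386.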