I found a way to optimize Stepan's algo even more:
we don't need to compact the "word-tetrada"s (the nibbles in each word) separately; we can mix them and compact them all at once:
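mmxbC9 dq 0C9C9C9C9C9C9C9C9h ;bias added to every character byte
mmxb39 dq '99999999' ;bytes above '9' are letters
mmxb07 dq 0707070707070707h ;correction added to the digit bytes only
;in: esi -> string of 8 hex characters (upper or lower case)
;out: eax = number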
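movq mm0,[esi] ;load 8 ASCII characters
movq mm1,mm0 ;keep a copy for the digit/letter test
paddb mm0,[mmxbC9] ;letters: low nibble = value; digits: still 7 short
pcmpgtb mm1,[mmxb39] ;FFh where the character is above '9' (a letter)
pandn mm1,[mmxb07] ;07h for digits, 00h for letters
paddb mm0,mm1 ;low nibble of every byte is now the digit value
movq mm1,mm0
psllw mm0,12 ;first character's nibble to the top of each word
psllw mm1,4 ;second character's nibble to the top of each word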
;! psrlw mm0,12
psrlw mm0,8 ;!
psrlw mm1,12
por mm0,mm1 ;! mix "word-tetrada"
packuswb mm0,mm0 ;compact it at once!
;! packuswb mm1,mm1 ;it is already in mm0 and will be packed
;! psllq mm0,4
;! paddb mm0,mm1 ;not needed, everything is already in mm0
movd eax,mm0
bswap eax
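For anyone who wants to trace the routine without an MMX debugger, here is a rough scalar model in C. It is only a sketch: the name hex8_to_u32 is made up for illustration, each loop stands in for one group of packed instructions, and it assumes the input is exactly 8 valid hex characters, as the routine itself does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the MMX routine above: 8 hex characters in, 32-bit value out.
   hex8_to_u32 is a hypothetical name, not something from the thread. */
static uint32_t hex8_to_u32(const char *s)
{
    uint8_t b[8];
    uint32_t result = 0;

    assert(s != NULL);

    for (int i = 0; i < 8; i++) {
        uint8_t c = (uint8_t)s[i];
        /* paddb mm0,[mmxbC9]: bytewise add with wraparound */
        uint8_t v = (uint8_t)(c + 0xC9);
        /* pcmpgtb + pandn: 07h for '0'..'9', 00h for letters */
        uint8_t fix = ((int8_t)c > (int8_t)'9') ? 0x00 : 0x07;
        /* paddb mm0,mm1: the low nibble now holds the digit value */
        b[i] = (uint8_t)(v + fix);
    }

    for (int w = 0; w < 4; w++) {
        /* One little-endian 16-bit lane: first character in the low byte */
        uint16_t lane = (uint16_t)(b[2 * w] | (b[2 * w + 1] << 8));
        /* psllw 12 then psrlw 8: first digit lands in bits 4..7 */
        uint16_t hi = (uint16_t)((uint16_t)(lane << 12) >> 8);
        /* psllw 4 then psrlw 12: second digit lands in bits 0..3 */
        uint16_t lo = (uint16_t)((uint16_t)(lane << 4) >> 12);
        /* por, then packuswb keeps one byte per lane (no saturation, value <= FFh) */
        uint8_t packed = (uint8_t)(hi | lo);
        /* movd + bswap: the first character pair ends up most significant */
        result = (result << 8) | packed;
    }
    return result;
}
```

With this model, "1aF09bC7" comes out as 1AF09BC7h. The shifts also throw away the stray high nibble that lower-case letters leave behind (for example 'a' becomes 2Ah before the shifts), which is why the routine accepts both cases.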
You only changed
paddb mm0,mm1
to
por mm0,mm1
What is the advantage?
Nexo, I think Svin missed a reply above.
Yes, I did, sorry :)
Polovnikov, found this one first; only the por and the methods differ :)
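On the por versus paddb question, a small C check may help. It is a sketch under one assumption: that the lines marked ";!" in the listing are the ones from the earlier version, so the old sequence was "pack each nibble separately, then psllq 4 and paddb". The function names are made up for illustration. Since the two nibbles never share a bit, OR and ADD give the same byte, so the saving would come from dropping the second packuswb and the psllq, not from por itself.

```c
#include <assert.h>
#include <stdint.h>

/* New variant: mix both nibbles inside each 16-bit lane, then pack once. */
static uint8_t mix_then_pack(uint16_t lane)
{
    uint16_t hi = (uint16_t)((uint16_t)(lane << 12) >> 8);   /* psllw 12, psrlw 8 */
    uint16_t lo = (uint16_t)((uint16_t)(lane << 4) >> 12);   /* psllw 4, psrlw 12 */
    return (uint8_t)(hi | lo);                               /* por, packuswb */
}

/* Older variant as read from the ";!" lines (an assumption):
   pack each nibble on its own, then shift and add. */
static uint8_t pack_then_mix(uint16_t lane)
{
    uint8_t hi = (uint8_t)((uint16_t)(lane << 12) >> 12);    /* psllw 12, psrlw 12, packuswb */
    uint8_t lo = (uint8_t)((uint16_t)(lane << 4) >> 12);     /* psllw 4, psrlw 12, packuswb */
    return (uint8_t)((uint8_t)(hi << 4) + lo);               /* psllq 4, paddb */
}

/* Exhaustive check over every possible 16-bit lane. */
static void check_all_lanes(void)
{
    for (uint32_t lane = 0; lane <= 0xFFFFu; lane++) {
        assert(mix_then_pack((uint16_t)lane) == pack_then_mix((uint16_t)lane));
    }
}
```

Calling check_all_lanes() passes for all 65536 lane values, so in this model the two variants are interchangeable in result and differ only in instruction count.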