Here is a fast algorithm to convert a buffer to hexadecimal text. It'd be great for a hexeditor. On my Athlon it runs as fast as a memory copy (256mb/sec). Anyone have any pointers for speeding it up? :P I'm trying to create a library of MMX functions for text. These algorithms will have rare usage, but it'll give some more practice with MMX. If anyone has any ideas of some text algorithms that would benifit from MMX, let me know.
UINT64 mmMask_0F = 0x0F0F0F0F0F0F0F0F;
UINT64 mmZero = 0x3030303030303030;
UINT64 mmNine = 0x3939393939393939;
UINT64 mmDiffNeg = 0xF8F8F8F8F8F8F8F8; //0xD8D8D8D8D8D8D8D8 // for lowercase letters
// 256 mb/sec!
void Bin2HexMMX(void *dst, void *src, int bytes)
{
__asm {
; int 3
mov esi, src
mov edi, dst
mov ecx, bytes
movq mm4, mmMask_0F
movq mm5, mmZero
movq mm6, mmNine
movq mm7, mmDiffNeg
shr ecx, 3
loop16:
movq mm0,
movq mm1, mm0
psrlq mm0, 4
pand mm1, mm4
pand mm0, mm4
paddb mm1, mm5
paddb mm0, mm5
movq mm3, mm1
movq mm2, mm0
pcmpgtb mm3, mm6
pcmpgtb mm2, mm6
psubusb mm3, mm7
psubusb mm2, mm7
paddb mm1, mm3
paddb mm0, mm2
movq mm2, mm0
punpckhbw mm0, mm1
punpcklbw mm2, mm1
movntq , mm0
movntq , mm2
add esi, 8
add edi, 16
dec ecx
jg loop16
emms
}
}
This message was edited by bitRAKE, on 6/22/2001 10:59:41 AMI guess mmx has no substitute for this famous coup:
add al,90h
daa
adc al,40h
daa
which turns a nibble into a hex digit. But could the
following be adapted?
sub al,10 ;al is 0-0Fh
sbb ah,ah
and ah,7
add al,"A"
sub al,ah
The instruction sbb ah,ah is equivalent to cbw in this code.
Does mmx have a sign-extend capacity like this?
Nice website bitRAKE, particularly the gifs.
Yeah, I thought the same thing. I haven't been able to come up with anything other than the saturated subtraction. I'll keep looking though - I have some other text algorithms that I'd like to do in MMX. I did think that I could do this faster, but I am new to MMX. The algorithm is very close to the second one you list above, but a couple instructions are used to order the bytes afterwards and the signs are gathered in a comparison which sets bytes based on the comparison (it's like a SET and a CMP in one instruction!)
Thanks for the kind words. There is a great deal I have to add to the site. Mainly I wanted to help out some web newbies with some helpful links, and provide myself with an online favorites section to access from work/library/friend's house. Over time I'll get my poetry and code up.
Larry, you can't get to your website by clicking on the little house icon in your messages! Need to add http:// to your page address in your profile. As you know from my messages, I've been there before - I just forgot to tell you earlier.
This message was edited by bitRAKE, on 6/24/2001 12:51:54 AM
BitRAKE do you know of any good tutorials, articles, links, etc to learn mmx commands. I can't find a basic introduction.
Great routine though!
Thanks
You tried searching google? That's what I did. I searched for individual instructions like 'psubusb'. There is so much info and algos! That's why I picked doing text algos, because there are so many graphics algos already in MMX.
This message was edited by bitRAKE, on 6/25/2001 8:38:55 PM
Thanks BR, the little house thingee works now.
Google is amazing.
Rake, since you're on MMX topics, i have a general question...
How do you mix FP and MMX instruction? i know that MMX has no instruction
that square a number, but FP does.
and both of MMX and FP share the same stack, inorder to use both
, you have to have EMMS (which empty the mmx state). Do you think
you can post me an example?
---by the way, Bin2Hex... what do you mean when you say BIN?
---Bin can mean alot of thing, in this context, i have no clue
Zadkiel, check out: www.cpuid.com for mmx instruction set.
no example, just the instruction set...
------------------------------------------------------------
256MB/sec! hmm... that seems to be FAST!!
HIEW <--- 300MB/sec! for opening file.. can you believe it!!
i wonder what code they use...
This message was edited by disease_2000, on 7/1/2001 5:33:59 AM
The FPU and MMX share the same stack/registers, but the state is undefined when switching between the two. Mixing MMX and FP instruction is really slow because of the state changes between the two - that is why SSE2 came into being.
What I mean by 'Bin' is binary data. I sort my convertion algos into directories: one that is 'to ASCII', another that is 'from ASCII', another that is 'Other'. This algo is in the 'to ASCII' folder and is called 'Bin2HexMMX.asm' I just make these up, and they are loosely coupled to the actual function. :)
HIEW is a useful program. It's very helpful to have such a useful tool to be quick as well.
Rake, you're strange. you just busted my virtual gut.
so, SSE2... hmmm... what CPU support SSE2?
about hiew, what do you do mostly with that program?
i mainly use it to view text file. ever since i have that program.
viewing text is the ONLY thing that i used it MORE.
SSE2 is on the Pentium III+, and Athlon2?
I use HIEW only in DOS for the same thing, mostly. :)
I am happy that you think it's funny. ;)
http://ulita.ms.mff.cuni.cz/pub/techdoc/
check it out Rake, Zadkiel.
the MMX document is there.
Wow! That's a nice set of docs! Thanks! :D