Here is a fast algorithm to convert a buffer to hexadecimal text. It'd be great for a hexeditor. On my Athlon it runs as fast as a memory copy (256mb/sec). Anyone have any pointers for speeding it up? :P I'm trying to create a library of MMX functions for text. These algorithms will have rare usage, but it'll give some more practice with MMX. If anyone has any ideas of some text algorithms that would benifit from MMX, let me know.

UINT64 mmZero    = 0x3030303030303030;
UINT64 mmNine    = 0x3939393939393939;
UINT64 mmDiffNeg = 0xF8F8F8F8F8F8F8F8; //0xD8D8D8D8D8D8D8D8 // for lowercase letters

// 256 mb/sec!
void Bin2HexMMX(void *dst, void *src, int bytes)
{
__asm {
;	int 3
mov esi, src
mov edi, dst
mov ecx, bytes
movq mm5, mmZero
movq mm6, mmNine
movq mm7, mmDiffNeg
shr ecx, 3

loop16:
movq mm0,
movq mm1, mm0
psrlq mm0, 4

pand mm1, mm4
pand mm0, mm4
movq mm3, mm1
movq mm2, mm0
pcmpgtb mm3, mm6
pcmpgtb mm2, mm6
psubusb mm3, mm7
psubusb mm2, mm7
movq mm2, mm0
punpckhbw mm0, mm1
punpcklbw mm2, mm1
movntq , mm0
movntq , mm2

dec ecx
jg loop16

emms
}
}
This message was edited by bitRAKE, on 6/22/2001 10:59:41 AM
Posted on 2001-06-22 10:41:00 by bitRAKE
I guess mmx has no substitute for this famous coup: add al,90h daa adc al,40h daa which turns a nibble into a hex digit. But could the following be adapted? sub al,10 ;al is 0-0Fh sbb ah,ah and ah,7 add al,"A" sub al,ah The instruction sbb ah,ah is equivalent to cbw in this code. Does mmx have a sign-extend capacity like this? Nice website bitRAKE, particularly the gifs.
Posted on 2001-06-23 05:59:00 by Larry Hammick
Yeah, I thought the same thing. I haven't been able to come up with anything other than the saturated subtraction. I'll keep looking though - I have some other text algorithms that I'd like to do in MMX. I did think that I could do this faster, but I am new to MMX. The algorithm is very close to the second one you list above, but a couple instructions are used to order the bytes afterwards and the signs are gathered in a comparison which sets bytes based on the comparison (it's like a SET and a CMP in one instruction!) Thanks for the kind words. There is a great deal I have to add to the site. Mainly I wanted to help out some web newbies with some helpful links, and provide myself with an online favorites section to access from work/library/friend's house. Over time I'll get my poetry and code up. Larry, you can't get to your website by clicking on the little house icon in your messages! Need to add http:// to your page address in your profile. As you know from my messages, I've been there before - I just forgot to tell you earlier. This message was edited by bitRAKE, on 6/24/2001 12:51:54 AM
Posted on 2001-06-23 23:35:00 by bitRAKE
BitRAKE do you know of any good tutorials, articles, links, etc to learn mmx commands. I can't find a basic introduction. Great routine though! Thanks
Posted on 2001-06-25 19:23:00 by Zadkiel
You tried searching google? That's what I did. I searched for individual instructions like 'psubusb'. There is so much info and algos! That's why I picked doing text algos, because there are so many graphics algos already in MMX. This message was edited by bitRAKE, on 6/25/2001 8:38:55 PM
Posted on 2001-06-25 20:38:00 by bitRAKE
Thanks BR, the little house thingee works now. Google is amazing.
Posted on 2001-06-25 21:08:00 by Larry Hammick
Rake, since you're on MMX topics, i have a general question... How do you mix FP and MMX instruction? i know that MMX has no instruction that square a number, but FP does. and both of MMX and FP share the same stack, inorder to use both , you have to have EMMS (which empty the mmx state). Do you think you can post me an example? ---by the way, Bin2Hex... what do you mean when you say BIN? ---Bin can mean alot of thing, in this context, i have no clue Zadkiel, check out: www.cpuid.com for mmx instruction set. no example, just the instruction set... ------------------------------------------------------------ 256MB/sec! hmm... that seems to be FAST!! HIEW <--- 300MB/sec! for opening file.. can you believe it!! i wonder what code they use... This message was edited by disease_2000, on 7/1/2001 5:33:59 AM
Posted on 2001-07-01 04:28:00 by disease_2000
The FPU and MMX share the same stack/registers, but the state is undefined when switching between the two. Mixing MMX and FP instruction is really slow because of the state changes between the two - that is why SSE2 came into being. What I mean by 'Bin' is binary data. I sort my convertion algos into directories: one that is 'to ASCII', another that is 'from ASCII', another that is 'Other'. This algo is in the 'to ASCII' folder and is called 'Bin2HexMMX.asm' I just make these up, and they are loosely coupled to the actual function. :) HIEW is a useful program. It's very helpful to have such a useful tool to be quick as well.
Posted on 2001-07-01 13:35:00 by bitRAKE
Rake, you're strange. you just busted my virtual gut. so, SSE2... hmmm... what CPU support SSE2? about hiew, what do you do mostly with that program? i mainly use it to view text file. ever since i have that program. viewing text is the ONLY thing that i used it MORE.
Posted on 2001-07-01 13:44:00 by disease_2000
SSE2 is on the Pentium III+, and Athlon2? I use HIEW only in DOS for the same thing, mostly. :) I am happy that you think it's funny. ;)
Posted on 2001-07-01 18:53:00 by bitRAKE
http://ulita.ms.mff.cuni.cz/pub/techdoc/ check it out Rake, Zadkiel. the MMX document is there.
Posted on 2001-07-03 16:44:00 by disease_2000
Wow! That's a nice set of docs! Thanks! :D
Posted on 2001-07-04 00:22:00 by bitRAKE