Hi to all GFX speed warriors :)

I need to calculate an alpha blend, which applies this formula to a source / dest RGB pixel with an alpha value:

dest = (alpha * dest) + ((1 - alpha) * source)

where source and dest are three-byte RGB values and alpha is a byte from 0 .. 255, which means:

0 .. no transparency (dest = source)
255 .. 100% transparency (dest = dest)

I made a precalc table, which holds all combinations of the product of two bytes, from 0 * 0 to 255 * 255, stored as a byte (SHR 8). My implementation of the task looks like this:
	mov	edx, lpLookup
	mov	esi, lpSource
	mov	edi, lpDest

	xor	eax, eax
	mov	ebx, eax
	mov	ah, 		; alpha channel
	not	ah
	mov	bh, 255			; 1 - alpha
	sub	bh, ah

	mov	al, 
	mov	bl, 			; color red
	mov	al, 
	add	al, 
	mov	, al

	mov	al, 
	mov	bl, 		; color green
	mov	al, 
	add	al, 
	mov	, al

	mov	al, 
	mov	bl, 		; color blue
	mov	al, 
	add	al, 
	mov	, al

This code has to be run for each pixel, so it would be nice to optimize it a little bit.
This message was edited by beaster, on 7/6/2001 5:23:58 AM
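For anyone who wants to check results against the asm, here is a rough C model of the table approach described above. It is only a sketch: the names mul_table, init_mul_table and blend_pixel are made up for illustration, and the table layout (row = weight, column = color byte) is an assumption, since the post does not show how lpLookup is organized.

```c
#include <stdint.h>

/* Hypothetical table: mul_table[a][b] = (a * b) >> 8, i.e. the product of
   two bytes stored as a byte (SHR 8), as described in the post. */
static uint8_t mul_table[256][256];

/* Fill the table once at startup. */
static void init_mul_table(void)
{
    for (int a = 0; a < 256; ++a)
        for (int b = 0; b < 256; ++b)
            mul_table[a][b] = (uint8_t)((a * b) >> 8);
}

/* Blend one 3-byte RGB pixel in place:
   dest = alpha * dest + (255 - alpha) * source, each product scaled by >> 8. */
static void blend_pixel(uint8_t *dest, const uint8_t *source, uint8_t alpha)
{
    const uint8_t *keep = mul_table[alpha];                   /* weight for dest   */
    const uint8_t *mix  = mul_table[(uint8_t)(255 - alpha)];  /* weight for source */

    for (int c = 0; c < 3; ++c)
        dest[c] = (uint8_t)(keep[dest[c]] + mix[source[c]]);
}
```

Two things this model makes visible. Because of the SHR 8, a weight of 255 does not reproduce the input exactly: blending source (200, 100, 50) with alpha = 0 gives (199, 99, 49), not (200, 100, 50). On the other hand, the two table values added per channel can never sum to more than 254, so the byte-sized add cannot wrap around.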
Posted on 2001-07-06 05:16:00 by beaster
Two things about the following code:

   xor   eax, eax
   mov   ebx, eax
   mov   ah,       ; alpha channel
   not   ah
   mov   bh, 255         ; 1 - alpha
   sub   bh, ah
mov ebx, eax: This obviously zeroes ebx, but it is not special-cased in the PII/PIII/P4 processors, so it will cause a partial register stall later at the "mov bh, 255". Changing it to "xor ebx, ebx" gives the same result but avoids the partial register stall.

Second point: "255 - not(p) = p"! So the code can be shortened to:

   xor   eax, eax
   xor   ebx, ebx
   mov   ah,       ; alpha channel
   mov   bh, ah
   not   ah
And you will still end up with the same result! If you don't need the alpha in ah, then you can obviously shorten it even more.
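If you want to convince yourself of the "255 - not(p) = p" claim, a brute-force check over all 256 alpha values is cheap. The following C sketch models the two register sequences step by step. The function names are hypothetical, and since the operand that loads alpha is missing from the listings above, alpha is simply passed in as a parameter here.

```c
#include <stdint.h>

/* Model of the original sequence:
   mov ah, alpha / not ah / mov bh, 255 / sub bh, ah */
static void weights_original(uint8_t alpha, uint8_t *ah, uint8_t *bh)
{
    *ah = alpha;
    *ah = (uint8_t)~*ah;          /* not ah      -> 255 - alpha */
    *bh = 255;
    *bh = (uint8_t)(*bh - *ah);   /* sub bh, ah  -> alpha again */
}

/* Model of the shortened sequence:
   mov ah, alpha / mov bh, ah / not ah */
static void weights_short(uint8_t alpha, uint8_t *ah, uint8_t *bh)
{
    *ah = alpha;
    *bh = *ah;                    /* bh = alpha       */
    *ah = (uint8_t)~*ah;          /* ah = 255 - alpha */
}

/* Returns 1 if both sequences leave the same ah and bh for every alpha. */
static int sequences_agree(void)
{
    for (int a = 0; a < 256; ++a) {
        uint8_t ah1, bh1, ah2, bh2;
        weights_original((uint8_t)a, &ah1, &bh1);
        weights_short((uint8_t)a, &ah2, &bh2);
        if (ah1 != ah2 || bh1 != bh2)
            return 0;
    }
    return 1;
}
```

In both versions ah ends up holding 255 - alpha and bh holds alpha, so sequences_agree() returns 1. This only checks the arithmetic; it says nothing about the partial register stall, which has to be measured on the actual CPU.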
Posted on 2001-07-06 05:41:00 by Mirno