This is my code, how is yours? :grin:
This code assumes the color is in 00RRGGBB format (or 00BBGGRR, that doesn't matter). I have to preserve ebx,ecx,esi, and edi because of the loop this code is in.
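mov edx,eax        ; copy of the pixel; eax = 00RRGGBB on entry
rol edx,16         ; edx = GGBB00RR, so dx = RR
ror eax,8          ; eax = BB00RRGG
and eax,0ffff00ffh ; eax = BB0000GG, so ax = GG
add dx,ax          ; dx = RR+GG
shr eax,24         ; eax = BB
add ax,dx          ; ax = RR+GG+BB
and eax,0ffffh
xor edx,edx
push ebx
mov ebx,3
div ebx            ; eax = (RR+GG+BB)/3
pop ebx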
_DATA SEGMENT
; works for all numbers less than 2^31
DIV3MUL DWORD 01010101010101010101010101010110y
_DATA ENDS
mov edx, eax
ror eax, 8
and edx, 00FF00FFh
and eax, 000000FFh
add eax, edx
mov edx, eax
ror eax, 16
add eax, edx
and eax, 0FFFFh
mul DIV3MUL
; Answer in EDX
It is important not to switch between 32-bit and 16-bit register usage.
Maybe this is better?
mov edx,eax
shr eax,16 ; assumes top byte is zero
add dl,dh
adc ah,0
add al,dl
adc ah,0
mul DIV3MUL
; Answer in EDX
I'm sure you know that this is not truly the brightness, because human eyes see the primary colors with different bias (i.e. green is very strong).
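To make that remark concrete, here is a small C sketch comparing the plain average with a weighted sum. The weights are the Rec. 601 luma coefficients (0.299, 0.587, 0.114) scaled to integers; nobody in this thread uses them, so treat the function names and the choice of weights as my own assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Plain average of the three channels of a 00RRGGBB value. */
uint32_t brightness_avg(uint32_t rgb)
{
    uint32_t r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
    return (r + g + b) / 3;
}

/* Weighted brightness (assumption: Rec. 601 weights scaled so that
   77 + 150 + 29 = 256, which lets a shift replace the division). */
uint32_t brightness_luma(uint32_t rgb)
{
    uint32_t r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
    uint32_t y = (77 * r + 150 * g + 29 * b) >> 8;
    assert(y <= 255);   /* the weights add up to 256, so the result fits in a byte */
    return y;
}
```

For pure green (0000FF00h) the average gives 85 while the weighted version gives 149, which is the kind of difference the bias causes. For gray pixels both agree to within rounding.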
One register to add the lower 3 bytes together:
mov eax, dwRGB
add ah, al
setc al
bswap eax
shr eax, 8
add ah, al
sbb al, al
shr eax, 8
adc ah, 0
But you'll still have to waste edx if you want to use the DIV3MUL thingy.
I don't know if MUL mem32 is faster or slower than MUL reg32.
mov edx, 55555556h ; 55555556h ~= 2^32/3
mul edx ; edx ~= (rgb * (2^32/3)) / 2^32
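If you want to convince yourself of the multiply trick without stepping through a debugger, the following C sketch models it. MUL leaves the high 32 bits of the 64-bit product in EDX, which is what the shift by 32 does here. The function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Sum of the three color bytes of a 00RRGGBB value (0..765). */
uint32_t channel_sum(uint32_t rgb)
{
    return ((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF);
}

/* Divide by 3 by multiplying with 55555556h (about 2^32/3) and keeping
   the high 32 bits of the product, like EDX after MUL. */
uint32_t div3_mul(uint32_t n)
{
    assert(n < 0x80000000u);   /* range stated for DIV3MUL earlier in the thread */
    return (uint32_t)(((uint64_t)n * 0x55555556u) >> 32);
}
```

The constant equals (2^32 + 2)/3, so the product is n/3 plus an error below 1/3 as long as n is less than 2^31. That is why the result never rounds up to the next integer in that range.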
Hmm... at least I have some interesting replies. Thanks all!
I'll tell a bit more about the program this was for, if you're interested. I'm currently working on a very nice transparent (UpdateLayeredWindow, so XP/2K only) user interface thingie, that uses a png with alpha channel that is 3 by 3 "tiles" (32x32 pixels each) large to paint an entire image. It looks nice with the partial transparency and all, but I wanted to be able to print opaque text to it. That's how it started...
This is how I currently do it:
1. Paint black-on-white text on tmpDC
2. Process the painted block pixel by pixel, and if the inverse (not'ed) brightness of a pixel is larger than the window's transparency at that point, the transparency is replaced with the not'ed brightness.
3. BitBlt the window part that the text will be painted on to tmpDC
4. Draw the text with transparent (SetBkMode,tmpDC,TRANSPARENT) background to tmpDC
5. Copy the pixel block from tmpDC back to the main window DC, without overwriting the alpha values.
This works ok if your window isn't too transparent, otherwise ClearType screws up and makes the edges look really ugly.
bitRAKE, I know this isn't the real brightness, but it's good enough for my purpose since most of the pixels are gray anyway (except for the ClearType ones of course).
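Step 2 of the list above could be sketched in C roughly like this. It assumes 32-bit pixels with the alpha in the top byte (AARRGGBB) and uses the plain average as brightness; the function name and the two-buffer layout are hypothetical, not taken from the actual program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* For every pixel of the black-on-white text block, raise the window's
   alpha to the inverted brightness of the text pixel when that is larger.
   Assumption: both buffers hold AARRGGBB pixels and have the same size. */
void raise_alpha_under_text(const uint32_t *text, uint32_t *window, size_t count)
{
    assert(text != NULL && window != NULL);
    for (size_t i = 0; i < count; i++) {
        uint32_t t = text[i];
        uint32_t sum = ((t >> 16) & 0xFF) + ((t >> 8) & 0xFF) + (t & 0xFF);
        uint32_t inv = 255 - sum / 3;      /* black text -> 255, white paper -> 0 */
        uint32_t alpha = window[i] >> 24;
        if (inv > alpha)                   /* only ever make the window more opaque */
            window[i] = (window[i] & 0x00FFFFFFu) | (inv << 24);
    }
}
```

Only the alpha byte is touched, so the color already in the window survives until the text is drawn over it in the later steps.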
That is a very impressive looking window effect! Maybe MMX would be a better solution - how we love to drag the pretty windows around. :)
It already moves quite smoothly (I have show window contents while dragging enabled). In fact I haven't been able to make it flicker, but that's pretty logical when you realize the UpdateLayeredWindow call and the painting are only done once in the program and Windows takes care of the rest :)
Perhaps if I have multiple windows moving on a timer... will try it out sometime.
Also, another question: I've just dusted off your Fast Alpha Blend algo, and I'm wondering what it does with the alpha values of the destination? For my purpose it'd be ideal if it added the two alpha values (clipping at 255 of course), but I don't think it's likely it already does that.
I don't know MMX (yet)...
MMX alpha blending, two pixels at once:
; mm0=0
; mm1=b(0,r0,g0,b0,0,R0,G0,B0) - a
; mm2=b(0,r1,g1,b1,0,R1,G1,B1) - b
; mm3=w(c,c,c,c) - c[0..FF]
movq mm6,mm2
movq mm5,mm1
punpcklbw mm2,mm0 ;mm2=w(r1,b1,R1,B1)
punpckhbw mm6,mm0 ;mm6=w(0,g1,0,G1)
punpcklbw mm1,mm0 ;mm1=w(r0,b0,R0,B0)
punpckhbw mm5,mm0 ;mm5=w(0,g0,0,G0)
psubsw mm2,mm1
psubsw mm6,mm5 ;mm6:mm2=b-a
psllw mm1,8
psllw mm5,8 ;mm5:mm1'=a
pmullw mm2,mm3
pmullw mm6,mm3 ;mm6:mm2'=c*(b-a)
paddw mm1,mm2
paddw mm5,mm6 ;mm5:mm1'=a+c*(b-a)
psrlw mm1,8
psrlw mm5,8 ;mm5:mm1=a+c*(b-a)
packuswb mm5,mm1 ;mm5=b(0,r2,g2,b2,0,R2,G2,B2)
; mm5=result
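For anyone who does not read MMX yet, here is a scalar C model of what the code above does to a single channel, following the same 16-bit steps. It is only a sketch for checking the arithmetic: the function name is made up, and it says nothing about how the bytes are arranged in the registers.

```c
#include <assert.h>
#include <stdint.h>

/* One channel of the blend: a + c*(b-a), with a and b in 0..FF and c in 0..FF. */
uint8_t blend_channel(uint8_t a, uint8_t b, uint8_t c)
{
    int16_t  diff = (int16_t)(b - a);                        /* psubsw: -255..255 */
    uint16_t prod = (uint16_t)((int32_t)diff * c);           /* pmullw keeps the low 16 bits */
    uint16_t sum  = (uint16_t)(((uint32_t)a << 8) + prod);   /* psllw 8, then paddw (wraps) */
    uint8_t  out  = (uint8_t)(sum >> 8);                     /* psrlw 8 */
    /* The exact value a*256 + c*(b-a) is always in 0..65280, so the
       16-bit wraparound in the two steps above loses nothing. */
    assert(out == (uint8_t)((a * 256 + c * (b - a)) >> 8));
    return out;
}
```

Note that c = FFh does not quite reach b: blending 0 towards 255 with c = 255 gives 254, because the result is divided by 256 rather than 255.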
Nexo - thanks but that code is even farther away from what I want to do than bitRAKE's is. What I need is:
src = argb
dst = argb
where a is the one-byte alpha value for that pixel.
dst(a) = src(a)+dst(a)
dst(r) = src(r) * src(a) + dst(r) * (255 - src(a))
(Green and blue in the same way as red)
bitRAKE's does exactly what I want, except it does something strange with the destination's alpha value.
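The blend described above might look like this in plain C. The formulas as written leave out the scaling back to one byte, so the division by 255 is my assumption, as are the function name and the AARRGGBB layout.

```c
#include <assert.h>
#include <stdint.h>

/* dst(a) = src(a) + dst(a), clipped at 255;
   dst(c) = (src(c) * src(a) + dst(c) * (255 - src(a))) / 255 for r, g, b. */
uint32_t blend_argb(uint32_t src, uint32_t dst)
{
    uint32_t sa = src >> 24, da = dst >> 24;
    uint32_t out_a = sa + da;
    if (out_a > 255)
        out_a = 255;                                     /* clip the summed alpha */
    uint32_t out = out_a << 24;
    for (int shift = 0; shift <= 16; shift += 8) {       /* blue, green, red */
        uint32_t s = (src >> shift) & 0xFF, d = (dst >> shift) & 0xFF;
        uint32_t c = (s * sa + d * (255 - sa)) / 255;    /* assumed normalization */
        assert(c <= 255);
        out |= c << shift;
    }
    return out;
}
```

With a source alpha of 0 the destination comes back unchanged, and with a source alpha of 255 the colors are taken entirely from the source.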
In my algo:
dA = [ sA * (sA - dA) / 256 ] + dA
Nexo, he needs per-pixel alpha.
:stupid: ? Maybe change the comments?
; mm0=0
; mm1=b(a0,r0,g0,b0,A0,R0,G0,B0) - a
; mm2=b(a1,r1,g1,b1,A1,R1,G1,B1) - b
; mm3=w(a0,a0,A0,A0) - c[0..FF]
movq mm6,mm2
movq mm5,mm1
punpcklbw mm2,mm0 ;mm2=w(r1,b1,R1,B1)
punpckhbw mm6,mm0 ;mm6=w(a1,g1,A1,G1)
punpcklbw mm1,mm0 ;mm1=w(r0,b0,R0,B0)
punpckhbw mm5,mm0 ;mm5=w(a0,g0,A0,G0)
psubsw mm2,mm1
psubsw mm6,mm5 ;mm6:mm2=b-a
psllw mm1,8
psllw mm5,8 ;mm5:mm1'=a
pmullw mm2,mm3
pmullw mm6,mm3 ;mm6:mm2'=c*(b-a)
paddw mm1,mm2
paddw mm5,mm6 ;mm5:mm1'=a+c*(b-a)
psrlw mm1,8
psrlw mm5,8 ;mm5:mm1=a+c*(b-a)
packuswb mm5,mm1 ;mm5=b(a2,r2,g2,b2,A2,R2,G2,B2)
; mm5=result
bitRAKE, it is per-pixel alpha?
Originally posted by Nexo
bitRAKE, it is per-pixel alpha?
:) It doesn't get the alpha value for each pixel, but applies what is in MM3. You could add a couple more instructions to make it per-pixel, but as it stands it uses only one alpha value. My algo extracts the alpha from each source pixel to use for the blend = per-pixel alpha blend.
bitRAKE, I was thinking you would add a couple of instructions ;)
; mm1=b(a0,r0,g0,b0,A0,R0,G0,B0) - a
pshufw mm3,mm1,11110101b
psrlw mm3,8
; mm3=w(a0,a0,A0,A0) - c[0..FF]
(a0,r0,g0,b0) - first source pixel; (A0,R0,G0,B0) - second source pixel; a0,A0 - alpha.
I think this is per-pixel alpha blending :) Isn't it?
Best regards, Nexo.
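A small C model of the two instructions above (pshufw with 11110101b, then psrlw by 8) shows where each alpha ends up. It treats the MMX register as a 64-bit integer with each pixel stored as AARRGGBB; the function name is invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Model of: pshufw mm3,mm1,11110101b / psrlw mm3,8 */
uint64_t broadcast_alpha(uint64_t two_pixels)
{
    const unsigned imm = 0xF5;                            /* 11110101b */
    uint64_t out = 0;
    for (unsigned i = 0; i < 4; i++) {
        unsigned sel = (imm >> (2 * i)) & 3;              /* source word for result word i */
        uint64_t word = (two_pixels >> (16 * sel)) & 0xFFFF;
        out |= (word >> 8) << (16 * i);                   /* psrlw 8 keeps the high byte = alpha */
    }
    assert((out & 0xFF00FF00FF00FF00ull) == 0);           /* every word is in 0..FF */
    return out;
}
```

Words 1 and 3 of the source hold alpha and red of each pixel, so selecting them twice and shifting right by 8 leaves each pixel's alpha in two words.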