I am doing a brightness adjustment (0-100% on each RGB channel of a 32-bit pixel) like this,
where multiplier holds 16-bit words of brightness (0 to 255):

movq mm6,multiplier
movq mm0,[esi]      ; load two 32-bit pixels
movq mm1,mm0
punpckhbw mm0,mm5   ; mm0 = words (A1 R1 G1 B1), mm5 assumed to be zero
punpcklbw mm1,mm5   ; mm1 = words (A0 R0 G0 B0)
pmullw mm0,mm6
pmullw mm1,mm6
psrlw mm0,8         ; divide by 256
psrlw mm1,8         ; divide by 256
packuswb mm1,mm0
movq [edi],mm1

It works fine. However, I wanted to allow brightness of more than 100%. I had assumed packuswb would be fine because it saturates when packing, but I can't find a decent way to do the multiplication. I could do both pmullw and pmulhw and combine the results, which is rather pointless. Another way, since I may want to go to a maximum of 400% or 800%, is to 'scale' the percentage from 0..255 down to 0..64.
Is there a better way that avoids that?
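To make the problem concrete, here is a minimal scalar C sketch of what the loop above does to one channel. The helper names `scale_pmullw` and `pack_us` are made up for illustration; they model a single 16-bit lane of `pmullw`/`psrlw 8` and of `packuswb`, and are not part of any library. Once the multiplier exceeds 255 the product may no longer fit in 16 bits, `pmullw` keeps only the low word, and the value wraps before `packuswb` gets a chance to saturate.

```c
#include <stdint.h>

/* Hypothetical helper: one 16-bit lane of pmullw followed by psrlw 8. */
static uint16_t scale_pmullw(uint8_t channel, uint16_t multiplier)
{
    /* pmullw keeps only the low 16 bits of the product. */
    uint16_t low = (uint16_t)((uint32_t)channel * multiplier);
    return (uint16_t)(low >> 8);
}

/* Hypothetical helper: one lane of packuswb.
   The word is read as signed and saturated to the range 0..255. */
static uint8_t pack_us(uint16_t word)
{
    if (word & 0x8000u) return 0;   /* negative as a signed word */
    if (word > 255)     return 255;
    return (uint8_t)word;
}
```

For example, `pack_us(scale_pmullw(200, 128))` gives 100 (50% of 200), while a multiplier of 512 (200%) gives 144 rather than the saturated 255, because 200 * 512 = 102400 wraps to 36864 in 16 bits.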
Posted on 2004-08-24 21:50:07 by klumsy
The multiplier in MM6 ranges from 0 to 7FFFh.

100% = 200h

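movq mm6, multiplier
movq mm2, [esi]      ; two 32-bit pixels, mm2 used as scratch
movq mm0, mm5        ; mm5 = 0
movq mm1, mm5
punpckhbw mm0, mm2   ; pixel byte lands in the high byte of each word
punpcklbw mm1, mm2
psrlw mm0, 1         ; keep the words non-negative for the signed multiply
psrlw mm1, 1
pmulhw mm0, mm6
pmulhw mm1, mm6
packuswb mm1, mm0
movq [edi], mm1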

The RGB values are scaled by placing each byte in the high half of its word and using a value of 200h for 100%, so no shift is needed after the multiply. MM6 is limited to 7FFFh because the multiply is signed.
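The arithmetic described above can be checked with a small scalar C model of one 16-bit lane. This is only a sketch: `brighten_pmulhw` is a made-up name, and the gain is assumed to stay in the range 0 to 7FFFh. With 200h meaning 100%, a gain of 7FFFh is just under 64x.

```c
#include <stdint.h>

/* Hypothetical helper: one lane of the sequence
   (byte in high half of word) -> psrlw 1 -> pmulhw -> packuswb.
   gain is assumed to be 0..0x7FFF, with 0x200 meaning 100%. */
static uint8_t brighten_pmulhw(uint8_t channel, int16_t gain)
{
    /* Byte in the high half, then shifted right once: 0..0x7F80. */
    int16_t word = (int16_t)(((uint16_t)channel << 8) >> 1);
    /* pmulhw: signed 16x16 multiply, keep the high 16 bits. */
    int32_t product = (int32_t)word * gain;
    int16_t high = (int16_t)(product >> 16);
    /* packuswb: signed word saturated to an unsigned byte. */
    if (high < 0)   return 0;
    if (high > 255) return 255;
    return (uint8_t)high;
}
```

With this model, `brighten_pmulhw(200, 0x200)` returns 200 (100%), `brighten_pmulhw(100, 0x400)` returns 200 (200%), and `brighten_pmulhw(200, 0x400)` saturates to 255.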

With SSE2/3DNow+, the unsigned high multiply (pmulhuw) can be used with a value of 100h for 100%, and the shift instructions can be deleted.
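A scalar C sketch of the unsigned variant, again for one lane and with a made-up helper name (`brighten_pmulhuw`). One thing the model makes visible: `packuswb` reads its input words as signed, so a product whose high word reaches 8000h or more packs to 0 instead of 255. In this model, gains up to 8000h (12800%) are safe for every byte value.

```c
#include <stdint.h>

/* Hypothetical helper: one lane of
   (byte in high half of word) -> pmulhuw -> packuswb.
   gain uses 0x100 for 100%. */
static uint8_t brighten_pmulhuw(uint8_t channel, uint16_t gain)
{
    uint16_t word = (uint16_t)((uint16_t)channel << 8);
    /* pmulhuw: unsigned 16x16 multiply, keep the high 16 bits. */
    uint16_t high = (uint16_t)(((uint32_t)word * gain) >> 16);
    /* packuswb treats the word as signed before saturating. */
    if (high & 0x8000u) return 0;
    if (high > 255)     return 255;
    return (uint8_t)high;
}
```

Here `brighten_pmulhuw(200, 0x100)` returns 200, `brighten_pmulhuw(200, 0x200)` saturates to 255, and `brighten_pmulhuw(255, 0xFFFF)` returns 0 because the high word (FEFFh) looks negative to `packuswb`.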

The signed MMX version works for up to ~6400% (7FFFh / 200h is just under 64). :)
Posted on 2004-08-25 11:40:54 by bitRAKE