The color counting scheme I am currently using is way too slow for large images so I came up with a new way to do it.

1: Create an array of 256 DWORDs and fill with 0
2: Each time a color is processed use the low order byte to index the secondary array
3: If no secondary array is found allocate one on the heap, consisting of the second byte + pointer to tertiary array
3a: Create a tertiary array and put in the color + count (1)
4: If the secondary array is found then scan it for the second byte
5: If the second byte is not found then add a new tertiary array and put in the color + count (1)
6: If the secondary array is found then scan the tertiary for our color
7: If it's found then increment the global count, the color count and return
8: If not found then add the color to the tertiary array

Problem is that I can't get it working. It will work fine for a while then just stop. I have allocated the heap as follows:

invoke HeapCreate,0,0,0

Do I have to make it moveable ? Is there something else I am doing wrong here ?

Sorry about the length of the code...
SearchTable FRAME hColorHeap,pArray,color,pCount

uses edi,esi,ebx
LOCAL tv :D
; Get the red color
movzx ebx,B[color]
mov esi,[color]
mov edi,[pArray]

; Get the offset to the appropriate node
mov eax,[edi+ebx*4]
or eax,eax
jz >>.NEW_NODE
; Scan the second level table for our index
mov edi,eax
invoke HeapSize,[hColorHeap],NULL,edi
mov ebx,eax
movzx edx,W[color]
shr edx,8
xor eax,eax
L0:
mov ecx,[edi+eax] ; [color=red][b]< EXCEPTION HERE[/b][/color]
cmp ecx,edx
je >>.NODE_FOUND
add eax,8
cmp eax,ebx
jl <L0

.NODE_NOT_FOUND
; There is no existing node for the color
mov eax,ebx
add eax,8
invoke HeapReAlloc,[hColorHeap],HEAP_ZERO_MEMORY,edi,eax
or eax,eax
jnz >L1
PrintError
PrintText("Why is this zero")
L1:
mov edi,eax
mov ecx,[pArray]
movzx edx,B[color]
mov [ecx+edx*4],edi

; move the index into the array
mov [edi+ebx],edx ;;;;;

; add the first item
invoke HeapAlloc,[hColorHeap],HEAP_ZERO_MEMORY,8
; move the address of the first entry into the node
mov D[edi+ebx+4],eax

; Set the item
mov [eax],esi
mov D[eax+4],1

mov eax,[pCount]
or eax,eax
jz >L2
inc D[eax]
L2:
mov eax,1
ret

.NODE_FOUND
/*
We have found the second byte of the color in the
master list. We now have to see if our color is in
the list.
On entry the color list base address is found at [edi+eax+4]
*/

; First we have to find out how many colors are in the array
mov [tv],edi
mov edi,[edi+eax+4]
invoke HeapSize,[hColorHeap],NULL,edi
mov ebx,eax
; Scan the node for our color
L3:
mov ecx,[edi+eax]
cmp ecx,esi
je >>.FOUND
add eax,8
cmp eax,ebx
jl <L3

; The color wasn't found so expand the node and add it
lea eax,[ebx+8]
invoke HeapReAlloc,[hColorHeap],HEAP_ZERO_MEMORY,edi,eax
mov edi,eax

; Update the node array with the new address
mov ecx,[tv]
mov [ecx+4],edi

; Add the new color
mov [edi+ebx],esi
mov D[edi+ebx+4],1

; increment the count if necessary
mov eax,[pCount]
or eax,eax
jz >L4
inc D[eax]
L4:
mov eax,1
ret


.NEW_NODE
/*
ADD NEW NODE
on entry [edi+ebx*4] points to this base
Allocate 8 bytes for the new node
[Node] = second byte of color
[Node+4] = Address of 3rd level array
*/

; Allocate a node
invoke HeapAlloc,[hColorHeap],HEAP_ZERO_MEMORY,8
mov [edi+ebx*4],eax

; Point EDI to our new node
mov edi,eax

; Add our first item
; Store the second byte of the color in the first DWORD
movzx eax,W[color]
shr eax,8
mov [edi],eax

; Create a color value list
invoke HeapAlloc,[hColorHeap],HEAP_ZERO_MEMORY,8
; Store the address of the list in the second DWORD
mov [edi+4],eax

; Fill in the color information
mov ecx,[color]
mov [eax],ecx

; Set the initial count to 1
mov D[eax+4],1

; If not a NULL pointer, increment the totals count
mov eax,[pCount]
or eax,eax
jz >L5
inc D[eax]
L5:
mov eax,1
ret

.FOUND
; edi+eax contains the pointer to the node item
mov edx,[edi+eax+4]
inc edx
mov [edi+eax+4],edx
mov eax,edx
ret

ENDF

=========================================
Exception code: EXCEPTION_ACCESS_VIOLATION
Instruction pointer: 00401831h

eax=00001060 ebx=00001060 ecx=00020208 edx=01250000 esi=00D6E4F7
edi=01252FA0 ebp=0022F4FC esp=0022F4F8 eip=00401831
CS=001B DS=0023 SS=0023 ES=0023 FS=0038 GS=0000 O D i S Z A p C
----------------------------------------------------------------
Posted on 2004-08-11 12:12:57 by donkey
How about using a two-level hierarchy? Either a 256 pointers to 64k-dword arrays, or 64k pointers to 256b-dword arrays?

Or perhaps use a hash table? You wouldn't have to use linked lists per hash entry, as you know the max indexes you'll ever get (2^24) - although allocating the hash index arrays should of course be done dynamically, otherwise you might as well do a 64meg table for color counting :)
Posted on 2004-08-11 13:03:16 by f0dder
Hi f0dder,

The problem is to try to get some better speed than from the brute force type approach. I have a single level right now but if you consider a 1000*1000 image (not unreasonable) it contains 1 Million of pixels. That's 1 million searches, in a photo there are generally 10,000 to 100,000 unique colors, worst case could be billions of compares. The Griffon helicopter photo takes approx 1/2 to process with a single level search, and it rises exponentially until you hit around 5 sec for a 1280*1024. That was the reason for the new algoritm, creating huge (64K) static arrays is out of the question, an average photo would be in the mulitmegabytes, I originally tried a linked list but it GPF'ed quicker than this method. I keep getting problems with the HeapReAlloc function, it will work fine for 700-800 colors then return 0.
Posted on 2004-08-11 13:21:23 by donkey
I assume by "searching for color" you mean searching an entire linked list... yep, this would be slow.

Now, the max memory you would use for counting colors would be (2^24)*4 = 64meg - this is too much, I agree.

Using a 64k entry table would require 256kb for the main table, and 256 bytes for each additional table (up to 256kb + 64meg if you're really unlucky about the color distribution).

Btw it might be beneficial to convert the RGB data to some other color model - I might be wrong about this, but I think the RGB values generally vary a lot, whereas if you convert to some "light level" based scheme, it will probably be easier to index.

Can you post a jpg or png of the helicopter? I'd like to play a bit :) - btw, what's your system specs?
Posted on 2004-08-11 13:50:57 by f0dder
System specs are 512MB memory, 134MB used, PIII 650 mobile.

BTW here is the semi-brute force method,it works fine (or appears to) but is slow, by slow I mean a 961 x 637 image with 128377 unique colors takes 2 seconds to count. The griffon (helicopter) image is 1.75 MB zipped so it's a bit too much to post and it is pretty much moot unless you have the exact same image I have.
SearchTable FRAME hHeap,pArray,color,pCount

uses edi,esi,ebx

; pArray points to an array of 256 QWORDS

mov ebx,[color]
mov esi,ebx
and ebx,0ffh
mov edi,[pArray]

mov eax,[edi+ebx*8]
or eax,eax
jz >>.NEWNODE
; scan for our color
mov ebx,[edi+ebx*8+4]
shl ebx,3
mov edi,eax
mov eax,ebx
sub eax,8
:
mov ecx,[edi+eax]
cmp ecx,esi
je >>.FOUND
sub eax,8
jns <

.NEWCOLOR
mov eax,ebx
add eax,8
invoke HeapReAlloc,[hHeap],0,edi,eax

; add the color
mov [eax+ebx],esi
mov D[eax+ebx+4],1

; update the pointer in the master list
mov ecx,esi
and ecx,0ffh
mov edi,[pArray]
mov [edi+ecx*8],eax
inc D[edi+ecx*8+4]

mov eax,[pCount]
or eax,eax
jz >
inc D[eax]
:
mov eax,1
ret

.NEWNODE
invoke HeapAlloc,[hHeap],0,8
mov [edi+ebx*8],eax
mov D[edi+ebx*8+4],1

mov [eax],esi
mov D[eax+4],1

mov eax,[pCount]
or eax,eax
jz >
inc D[eax]
:
mov eax,1
ret

.FOUND
inc D[edi+eax+4]
mov eax,[edi+eax+4]
ret

ENDF
Posted on 2004-08-11 14:18:56 by donkey
I'm playing around a bit now. My "2-level idea" certainly sucks, all of R,G,B are too spread so they're not suitable as index values. Might help using a different color model, but haven't tested.

A C++ std::map structure didn't really speed up stuff - doesn't seem worse either, though. So perhaps a dedicated tree structure would be a bit better than the bruteforce two-level approach.

Going to test a hash table approach shortly...
Posted on 2004-08-11 15:24:48 by f0dder
Just allocate 64MB. :)

@@:
mov eax,
add esi, 4
inc
dec ecx
jne @B

(Unroll the loop maybe?)
Then post process that array for a list of colors/counts.
Posted on 2004-08-11 16:00:24 by bitRAKE
:grin: Thanks bitRAKE, but I think a 64MB buffer is a bit much

Thanks f0dder, I wouldn't be too bothered normally, for TBPaint the semi-brute force algo can count an image strip in worst case 0.02 seconds so for my purposes it was and still is fine. But I want it in my graphics.lib and I'm not satisfied with the performance on photo quality images. This would not be too bad but it is used for several functions, calculating the best color table for an image when reducing the color count, reordering the color table, the histogram function, and the color count itself. So it is a fairly critical function and speed is important when it's in the lib. For now I am going to put in the brute force algorithm as I essentially only have until Saturday morning and then I'll be gone, I want to get this lib useful before that.
Posted on 2004-08-11 16:15:37 by donkey
Donkey,

This thread got me to thinking about color counting and I have an idea that I'm playing with right now. Let me see how it turns out before really go into any depth on it. But basically, it is a 2d array (for red, green) and a 256 byte allocation for the blues on each of these (assuming one blue is found for this reg-green pair).

Once again, let me play with it more. But I think it'll be fast enough for ya. I'll provide a C function that I'm doing my testing and also use the vc++ toolkit and provide you with an ASM output if this works out well.

I'll get back with ya on this soon.

Relvinian
Posted on 2004-08-11 16:38:00 by Relvinian
Hi Relvinian,

Sounds good, sort of a modified version of mine but with a static size at the end. I would have to see how big the array could grow in a real image though. I need to be able to extract the following information...

1: The total number of colors
2: The number of instances the first BYTE appears
3: A complete list of unique colors with their individual counts

Big order but it has to be capable of doing all the things that the brute force algo does as it is a worker function called by alot of other functions. #2 is optional as there are alot of very fast ways to do that.
Posted on 2004-08-11 16:48:54 by donkey

Hi Relvinian,

Sounds good, sort of a modified version of mine but with a static size at the end. I would have to see how big the array could grow in a real image though. I need to be able to extract the following information...

1: The total number of colors
2: The number of instances the first BYTE appears
3: A complete list of unique colors with their individual counts

Big order but it has to be capable of doing all the things that the brute force algo does as it is a worker function called by alot of other functions. #2 is optional as there are alot of very fast ways to do that.


Donkey,

Well this is the source file in C. It is pretty fast. It'll return the number of colors and can be easily adapted for item #3. Option #2 would be a little bit of extra work but still doable.

Anyway, here is the code. The pImageData param assumes the raw pixel data in 32-bit format. CX, CY are the width and height of the image to process. Let me know how this function works for your tests.

If you want the ASM version, let me know and I'll post it.



int CountColors(LPVOID pImageData, const int CX, const int CY)
{
BYTE* Array[256][256] = {0};
DWORD* pColor;

int x, y, i;
int nCount;

// start with the first pixel
pColor = (DWORD*)pImageData;

// Step through each of the scan lines of the image
for (y = 0; y < CY; ++y)
{
for (x = 0; x < CX; ++x, ++pColor)
{
BYTE Red, Green, Blue;
BYTE* pBlueArray;

// get the 'r' value from the pixel
Red = (BYTE)(*pColor >> 16);

// get the 'g' value from the pixel
Green = (BYTE)(*pColor >> 8);

// get the 'b' value from the pixel
Blue = (BYTE)(*pColor);

// see if we have already allocated an array of Blue bytes
if (Array[Red][Green] == NULL)
Array[Red][Green] = (BYTE*)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, 256);

// set the blue byte indicator to true
pBlueArray = Array[Red][Green];
pBlueArray[Blue] = TRUE;
}
}

// now that we have the colors marked, we need to count the flags set
nCount = 0;
for (x = 0; x < 256; ++x)
{
for (y = 0; y < 256; ++y)
{
if (Array[x][y])
{
// some blues are marked, walk the count and add up
BYTE* pBlues = Array[x][y];
for (i = 0; i < 256; ++i)
{
if (pBlues[i])
++nCount;
}

// free allocation
HeapFree(GetProcessHeap(), 0, pBlues);
}
}
}

return nCount;
}


Relvinian
Posted on 2004-08-11 17:47:54 by Relvinian
Hi Relvinian,

Yes the ASM version would be great, I'm extremely lame at any language but ASM, can't even figure out the C compiler by myself. Option 2 is not at all important, it was used in the histogram function but I have thought of a better histogram algorithm and already implemented it, it is faster than can be done with any type of color counter, it scans huge images in the blink of an eye. My needs are now just total colors and instances per RGB color.
Posted on 2004-08-11 17:56:51 by donkey

Hi Relvinian,

Yes the ASM version would be great, I'm extremely lame at any language but ASM, can't even figure out the C compiler by myself. Option 2 is not at all important, it was used in the histogram function but I have thought of a better histogram algorithm and already implemented it, it is faster than can be done with any type of color counter, it scans huge images in the blink of an eye. My needs are now just total colors and instances per RGB color.


Here's the RAW asm file from that C function. I'll let you convert it to GoAsm format. :) If you want me to put it into Masm format with a epilogue and prologue code, let me know. Also, the array is on the stack and may want to be moved to the heap.

Anyway, let me know if you want to me "polish" it up!



_TEXT SEGMENT
tv508 = -262152 ; size = 4
tv172 = -262152 ; size = 4
tv213 = -262148 ; size = 4
_Array$ = -262144 ; size = 262144
_pImageData$ = 8 ; size = 4
_CX$ = 12 ; size = 4
_CY$ = 16 ; size = 4
_CountColors@12 PROC NEAR ; COMDAT
; File d:source estjunk.c
push ebp
mov ebp, esp
and esp, -8 ; fffffff8H
mov eax, 262156 ; 0004000cH
call __chkstk
push ebx
push esi
push edi
xor eax, eax
mov ecx, 65535 ; 0000ffffH
lea edi, DWORD PTR _Array$[esp+262172]
mov DWORD PTR _Array$[esp+262168], 0
rep stosd
mov eax, DWORD PTR _CY$[ebp]
test eax, eax
mov edi, DWORD PTR _pImageData$[ebp]
jle SHORT $L74058
mov DWORD PTR tv172[esp+262168], eax
$L74056:
mov eax, DWORD PTR _CX$[ebp]
test eax, eax
jle SHORT $L74057
mov DWORD PTR tv213[esp+262168], eax
$L74059:
mov ecx, DWORD PTR [edi]
mov bl, BYTE PTR [edi]
mov eax, ecx
shr eax, 16 ; 00000010H
shr ecx, 8
movzx esi, al
shl esi, 8
movzx eax, cl
add esi, eax
cmp DWORD PTR _Array$[esp+esi*4+262168], 0
lea esi, DWORD PTR _Array$[esp+esi*4+262168]
jne SHORT $L74070
push 256 ; 00000100H
push 8
call DWORD PTR __imp__GetProcessHeap@0
push eax
call DWORD PTR __imp__HeapAlloc@12
mov DWORD PTR [esi], eax
$L74070:
mov edx, DWORD PTR [esi]
movzx ecx, bl
add edi, 4
sub DWORD PTR tv213[esp+262168], 1
mov BYTE PTR [ecx+edx], 1
jne SHORT $L74059
$L74057:
sub DWORD PTR tv172[esp+262168], 1
jne SHORT $L74056
$L74058:
xor esi, esi
lea eax, DWORD PTR _Array$[esp+262168]
mov DWORD PTR tv508[esp+262168], 256 ; 00000100H
$L74072:
mov edi, eax
mov ebx, 256 ; 00000100H
$L74075:
mov edx, DWORD PTR [edi]
test edx, edx
je SHORT $L74076
lea eax, DWORD PTR [edx+1]
mov ecx, 64 ; 00000040H
$L74080:
cmp BYTE PTR [eax-1], 0
je SHORT $L74081
add esi, 1
$L74081:
cmp BYTE PTR [eax], 0
je SHORT $L74111
add esi, 1
$L74111:
cmp BYTE PTR [eax+1], 0
je SHORT $L74112
add esi, 1
$L74112:
cmp BYTE PTR [eax+2], 0
je SHORT $L74113
add esi, 1
$L74113:
add eax, 4
sub ecx, 1
jne SHORT $L74080
push edx
push 0
call DWORD PTR __imp__GetProcessHeap@0
push eax
call DWORD PTR __imp__HeapFree@12
$L74076:
add edi, 4
sub ebx, 1
jne SHORT $L74075
sub DWORD PTR tv508[esp+262168], 1
mov eax, edi
jne SHORT $L74072
pop edi
mov eax, esi
pop esi
pop ebx
mov esp, ebp
pop ebp
ret 12 ; 0000000cH
_CountColors@12 ENDP
_TEXT ENDS
Posted on 2004-08-11 18:06:03 by Relvinian

Hi Relvinian,

Yes the ASM version would be great, I'm extremely lame at any language but ASM, can't even figure out the C compiler by myself. Option 2 is not at all important, it was used in the histogram function but I have thought of a better histogram algorithm and already implemented it, it is faster than can be done with any type of color counter, it scans huge images in the blink of an eye. My needs are now just total colors and instances per RGB color.


Donkey,

I did some quick tests with images on this C code. All tests were from ranges of 30x30 to 1024x768. The times are from 13ms to 88ms on a P4.
Posted on 2004-08-11 18:08:32 by Relvinian
Hi Relvinian,

Thanks !! I can translate it myself, the labels are all there so it's pretty easy to do mostly just search and replace stuff.
Posted on 2004-08-11 18:10:23 by donkey
Hi Relvinian,

Looking through the listing file I think it will be easier to directly translate the C source. I'll get to it tonight. BTW you're obviously a C programmer first, nobody whose primary language was assembly would use CX as a parameter :grin:
Posted on 2004-08-11 18:26:40 by donkey

Hi Relvinian,

Looking through the listing file I think it will be easier to directly translate the C source. I'll get to it tonight. BTW you're obviously a C programmer first, nobody whose primary language was assembly would use CX as a parameter :grin:


Yep, I started programming a long time ago. First was BASIC on the Atari 6502 chip set (back in 1980), then moved to M65 (assembly for the 6502). In 1992 is when I started programming professionally as a career in C. Moved to C++ in 1995. Just started programming in assembly for the x86 in March of this year. So ASM still takes me a little bit of time to do but I'm learning at a fast rate.

I'll go ahead and convert my C function to ASM for ya too. I'll be in Masm format but if I remember correctly, there's a MASM->GoAsm converter that does a good job.

Relvinian
Posted on 2004-08-11 18:37:04 by Relvinian
Hi,

This is my translation of the function with a few changes to remove some unnecessary loops as I use only 32 bit DIB sections internally. Note that in raw format the RGB components are reversed. It is very very fast and simple. Thanks :alright:
CountColors FRAME hImage

LOCAL pArray :D
LOCAL hHeap :D
LOCAL nCount :D
LOCAL cbImage :D
LOCAL dibs :BITMAP

invoke GlobalAlloc,GMEM_ZEROINIT,65536*4
mov [pArray],eax

invoke HeapCreate,0,0,0
mov [hHeap],eax

; Blue*Green = EAX
; Red = ECX

; offset Array = EDI
; pColor = EBX

invoke GetObject,[hImage],SIZEOF BITMAP,offset dibs
mov eax,[dibs.bmWidthBytes]
mul D[dibs.bmHeight]
mov [cbImage],eax

mov edi,[pArray]
mov ebx,[dibs.bmBits]
add [cbImage],ebx

.MARKCOLORS
mov ecx,[ebx]
mov eax,ecx
shr ecx,16
and ecx,0ffh
and eax,0ffffh
mov esi,[edi+eax*4]
or esi,esi
jnz >
push eax,ecx
invoke HeapAlloc,[hHeap],HEAP_ZERO_MEMORY,256
mov esi,eax
pop ecx,eax
mov [edi+eax*4],esi
:
mov B[esi+ecx],0FFh
add ebx,4
cmp ebx,[cbImage]
jl <.MARKCOLORS

mov D[nCount],0
mov ebx,65535
L1:
mov eax,[edi+ebx*4]
or eax,eax
jz >.NOARRAY
mov edx,255
.COUNTMARKS
mov cl,[eax+edx]
add cl,cl
adc D[nCount],0
dec edx
jns <.COUNTMARKS
.NOARRAY
dec ebx
jns <L1

invoke GlobalFree,[pArray]
invoke HeapDestroy,[hHeap]
mov eax,[nCount]
ret
ENDF
Posted on 2004-08-11 19:59:17 by donkey
Looks good there Donkey.

I'm sure there's plenty of room for some basic optimization in my algo of course...:)

Anyway, glad you found the code to be some what worthwhile.

Relvinian

PS - How are the performance tests? You said it was nice and fast. Acceptable for your LIB stuff now?
Posted on 2004-08-11 20:42:06 by Relvinian
Yup, I did a few simple things like removing the MUL for Blur*Green and just using the WORD instead as well as setting the Red flag to 0FFh so I could use the carry flag instead of a JZ to increment the count. All in all it tests at almost exactly 10x faster than my brute force approach so I have added it to my lib, with proper credit ofcourse. As well, there is a histogram function and an AdjustLuma function.
Posted on 2004-08-11 21:01:34 by donkey