The program I'm working on heavily uses the heap functions to allocate memory, but as most pieces of memory allocated are quite small I've been thinking about a custom heap on top of the windows heap.

FYI, here's the frequency table of the allocated memory size and the weighted average (28.4 bytes).


size freq | size freq
32 59 | 71 2
28 56 | 55 2
16 50 | 49 2
68 40 | 43 2
36 21 | 39 2
5 19 | 30 2
11 17 | 24 2
10 10 | 2 2
6 9 | 19 2
12 8 | 83 1
7 7 | 80 1
17 6 | 69 1
13 6 | 65 1
9 5 | 61 1
56 5 | 59 1
20 5 | 53 1
18 5 | 52 1
15 5 | 44 1
14 5 | 40 1
8 4 | 37 1
50 4 | 31 1
48 4 | 27 1
2 4 | 25 1
41 3 | 22 1
23 3 |

weighted average:
~ 28 bytes


Because of the small sizes, I want to have as little overhead as possibe. I was thinking about allocating a relatively large block (say 4/16/64 KB). Each block gets a small header, telling the size of the block (max, free/used), and some prev & next pointers to the previous and next block.
When memory is allocated, the 8 bytes before the actual memory pointer will contain the size of the allocated piece and a pointer to the block header.
This is just one idea, I'm still thinking of improvements and other methods and I'd like to hear your suggestions or useful urls as well.

Thomas

edit: Found this url http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dngenlib/html/heap3.asp, haven't fully read it yet but seems useful.
Posted on 2002-05-24 11:02:34 by Thomas
Ahh, I have been wondering along the same lines about the heap for awile now. I am working out a program on paper and its requirements look much the same on the memory side of things. However I have not decided on how to implement yet because of such considerations.

Like if I have to constantly reallocate alot of portions to make it bigger or smaller will it eventually fragment the heap? I dont even know if the heaps fragmentable :tongue:
Or if Windows will do garbage and packing sweeps...

Hey Thomas, what about two lists?
I say two just for speed. One for used and one for free. This would also free up some of the space in the individual headers.

I dont see a need to go through a whole list looking for free spots just to find out you need a new one hehe.
Posted on 2002-05-24 12:21:56 by Graebel

Like if I have to constantly reallocate alot of portions to make it bigger or smaller will it eventually fragment the heap? I dont even know if the heaps fragmentable
Or if Windows will do garbage and packing sweeps...


Yes you need to consider these things. It all depends on the heap use characteristics of your program. If you've consistantly used a wrapper for all allocations/deallocations (like me) it's easy to create some statistics from them.

Btw, windows can't defragment the heap as moving around allocated blocks would invalidate the pointers to them.



Hey Thomas, what about two lists?
I say two just for speed. One for used and one for free. This would also free up some of the space in the individual headers.

I dont see a need to go through a whole list looking for free spots just to find out you need a new one hehe.


Very good point! Maybe these lists could start at the end of one allocated block and grow down (like the stack)? This way the total size of the lists isn't fixed and thus produces less overhead.

Another thing I thought about is how to make them thread-safe. We don't want two threads fiddling around with the lists at the same time. A simple mutex would suffice, I don't think it would influence the speed much either.

Thomas
Posted on 2002-05-24 12:36:31 by Thomas
Here's my proposal:

Heap block:


[start of heap block here]
---------------------
HEAP_HEADER header
---------------------

HEAP_ITEM heapItems[] (growing upwards)


---------------------- <--+
HEAP_LIST_ITEM usedMemory[] |
---------------------- | (growing downwards)
HEAP_LIST_ITEM freeMemory[] |
---------------------- <--+
[end of heap block here]

HEAP_HEADER
{
- DWORD: pointer to next heap block
- DWORD: pointer to prev heap block
- DWORD: mutex handle for manipulating this heap block's header or lists
- DWORD: number of items in the freeMemory array
- DWORD: number of items in the usedMemory array
- DWORD: total size of memory block (including header & lists)
- DWORD: pointer to the end of the usedMemory array
- DWORD: reserved
}

HEAP_ITEM
{
- DWORD: pointer to HEAP_HEADER of this heap block (=start of block)
- DWORD: size of this heap item, not including the first two DWORDS.
(the actual size a HEAP_ITEM struct takes is aligned to the
next 8 byte boundary, this size is the size of the bytes
that are actually used)
- x bytes: heap item data
- padding to next 8 byte memory boundary
}

HEAP_LIST_ITEM
{
- DWORD: pointer to start of used or free memory piece in this block
- DWORD: size of used or free memory block
}


The usedMemory array is immediately followed by the freeMemory array. The
end of the freeMemory array is always the end of the heap block. When an item
to one of the arrays is added, the list grows downwards in memory, no data is
written after the end of the memory block. This does require a special way of
manipulating the arrays, namely:

- to add an item to the usedMemory list:
get the offset of the first item in the usedMemory array. Add the item
*before* this item.
- to remove an item from the usedMemory list:
move the first item of the usedMemory list over the item to remove
- to add an item to the freeMemory list:
move the last item of the usedMemory array, before the first item in
the usedMemory array, thereby freeing the last item. Into this item,
copy the new item to be added to the freeMemoryList.
- to remove an item from the freeMemory list:
move the first item in the freeMemory array over the item to be deleted.
Then move the first item of the usedMemory array over the first item in
the freeMemory array.

In all cases, the heap block header has to be updated to reflect the changes.


There is some global data as well (protected by another mutex) which contains the first block pointer, number of blocks etc.

What do you think?

Thomas
Posted on 2002-05-24 13:17:14 by Thomas
Sounds good. There is not a whole lot of ways you can code something like this.


- to add an item to the usedMemory list:
get the offset of the first item in the usedMemory array. Add the item
*before* this item.


The point of the free list is to reuse memory when you can right? I would say it should be:

- to add an item to the usedMemory list:
Check the freeMemory list for an item of usable length. If found move this item to the usedMemory list and copy contents, amend freeMemory list. If not, get the offset of the first item in the usedMemory array. Add the item *before* this item.

Or something like that. Maybe thats what you were thinking and just worded it differently...
Posted on 2002-05-24 13:43:49 by Graebel

Check the freeMemory list for an item of usable length. If found move this item to the usedMemory list and copy contents, amend freeMemory list. If not, get the offset of the first item in the usedMemory array. Add the item *before* this item.


If no entry from the freeMemory list can be used, the wanted memory does not fit in the current block and you'll have to try in the next block. If none of the blocks provide the required memory, a new block has to be allocated.

Thomas
Posted on 2002-05-24 13:47:32 by Thomas
I think we can't use a mutex for each heap block but a global mutex is required. When one thread tries to add a heap block to the linked heap block list, another thread might be walking through list list and miss the new block.
Do you think HeapAlloc can be called by two threads at the same time (on the same heap handle), i.e. does one of the calls have to wait for the other to finish?

Thomas
Posted on 2002-05-24 14:18:44 by Thomas
Ahah... found this


HeapWalk can fail in a multithreaded application if the heap is not locked during the heap enumeration. Use the HeapLock
and HeapUnlock functions to control heap locking during heap enumeration.


While I do not think you need to enumerate the heap, HeapLock and HeapUnlock should help eliminate problems with multiple threads

:alright:
Posted on 2002-05-24 14:42:40 by Graebel
Just quickly overlooking this thread, thought I might put in my two cents. I think the simplest approach to implement would be something like a FAT system:

1) Grab 4kb of memory
2) break it into 128 blocks 32 bytes a piece
3) map these into a 16 byte bit mask

Simple functions:

1)To allocate a block check you 16 bytes for a free bit. This would be very easy if you compare against -1.
2) To free a blcok of memory, just set the bit to zero

Should be easy enough to write an allocation function that grabs a couple of consecutive blocks. Also, defragging would be simple (if you need it). Finally, all data would be aligned to 32 bytes.
True, it's a bit wasteful, but I think it would be very fast for your purposes

--Chorus
Posted on 2002-05-24 15:23:20 by chorus
chorus: That's a nice suggestion as well, not much overhead...

Graebel: the problem is not with the heap functions. It's the heap blocks in my custom heap. These blocks are linked together. However when you are allocating memory these blocks have to be scanned until a block is found that has enough free memory. If not, a new block has to be allocated.

Say thread 1 has to allocate a new block. This also means it is changing properties of existing blocks (prev or next pointer). Therefore, thread 2 cannot scan through the existing blocks when thread 1 is modifying them. In general, only one thread may modify block data (i.e. headers, lists, not the actual data of course) at a time. So I think a global mutex is the only solution.

I asked about HeapAlloc because I wondered if the windows heap functions have the same limitation.

Thomas
Posted on 2002-05-24 15:31:32 by Thomas
Arrghh the answer was right in front of me :grin:
docs on HeapAlloc, second parameter:

HEAP_NO_SERIALIZE Specifies that mutual exclusion will not be used while the HeapAlloc function is accessing the heap.
This value should not be specified when accessing the process heap. The system may create additional threads within the application's process, such as a CTRL+C handler, that simultaneously access the process heap.


So the heap functions do have the same limitation :) Well at least for the default process heap.

Thomas
Posted on 2002-05-24 15:33:22 by Thomas