Hey guys..

I'm looking to implement an ASM-based memory management system that supports dynamic variable sizes at runtime, etc. for a compiler I am trying to write (at the code generation phase).. Basically what I want to do is pretty standard based on a heap of X bytes that has the following structure:

struc mem_block
{
.factive db
.fdatasize dw
.prev dw
.next dw
.data ; Not sure what to do here with this
.reserved dw
.bdatasize dw
.bactive db

}

In my design this struct is a node of a double linked list that will store all of the variables. I would need to add nodes to the linked list as required using something like this:

Check for an existing free mem_block
If no mem_block exists that is free and big enough, create a new one and link it to the list
If a mem_block is free (and big enough), check the size of the mem_block against the needed
storage space
If a free mem_block is found but is signficantly larger than the needed space, break that
mem_block into two mem_blocks -- one to store the data in and the other will be flagged as
free for the next time storage is needed

I would also desire to implement some sort of garbage cleanup where two adjancent free mem_blocks can be combined into a single larger mem_block. I believe this is probably the best way to accomplish a dynamic memory management system.

I hope I've explained what I'd like to do, and the reason I am posting this here is because I have a few questions:


1) Is this concept feasible using FASM (or for that matter, assembler in general)?
2) Are there any ways that FASM can make my life signficantly easier in trying to create this memory manager?
3) Is there perhaps an easier way to accomplish this task in ASM (such as perhaps a C library I could write)

If anyone could take a shot at these questions I'd appreciate it. I know FASM supports marcos and strucs, however I haven't seen enough FASM example code or tutorials to actually understand how they work enough to make use of them. Also, if at all possible I would like the memory manager to be portable between Win32 and Linux systems (which makes using win32 function calls not an option)..

I know this is a lot of questions and it's not a trivial project I'm trying to do. I guess I'm just hoping some of you ASM gurus out there can help a HLL junkie learn the ropes. :) I'd appreciate any feedback you guys may provide!
Posted on 2003-02-26 06:26:22 by coogle
I am porting to FASM a double linked list I've coded some time ago (In VB, shame on me :rolleyes: ) that I am using as data-structure for an engineering program. I think that the code will serve for your purposes. There are two differences: I use dwords instead of words and I use a structured list (fathers and childs)
But for a dynamic memory management with garbage collector, I think you do best with a different data structure. let me look at Knuth's...

... for the double linked list, I'll send you the source as soon as the port is 'workable'

And yes, code it in FASM :alright:
Posted on 2003-02-26 06:57:56 by pelaillo
:( Sorry coogle, I don't know why the gurus seems to be not interested on this subject. I think your implementation will be a good contribution to the community.

Regarding your questions, IMHO there will be not easy at all. But in the other hand, it will not be harder than doing it in C. (besides the fact you use an already done library ;) ) I am doing it in FASM because I need it for other purposes.

Regarding garbage collector methods and possible alternatives, I find some info that I hope you will find interesting:

The conclusion presented in TAOCP is that as garbage collector takes large chunks of computer time so it should be used as "last resort" method. The garbage collector tend to leave the memory broken up. Requires a strong discipline with pointers and it is difficult to achieve.

The following is from other sources as well:
"Empirical studies from around 10 years ago showed that conservative
garbage collectors had comparable performance to manual memory management --
for some applications GC was faster, for some slower, but on average the
same -- and garbage collectors have improved a lot since then. Manual
memory management can also take over your program at unexpected times; have
you ever looked at the amount of work malloc and free have to do to avoid
heap fragmentation, or how reference counting causes poor locality of
reference and thereby lots of cache misses?" (N. Pryce)

"Autonomous Garbage Collection" (B. Willard, O. Frieder) says:

"Memory management defects related to dynamic storage allocation account for
some of the most problematic and complex defects in existence today. It is not
uncommon for long running information server applications to be plagued with memory
leaks. This is especially true for large-scale information processing systems, developed
with a programming language that employs dynamic memory allocation under the
paradigm that places the responsibility on programmers to explicitly deallocate
dynamically allocated storage after it is no longer in use by the program.
...
Garbage collectors are effective because they are capable of bounding the amount
of storage lost due to memory leaks. As long as the amount of lost storage is bounded,
memory and its associated page files can be sized to accommodate the usage of the
application. Thus, our research has targeted a solution that bounds memory storage loss
due to memory leaks. In addition, we targeted a solution that addresses such problems in
fielded systems. Traditionally, all garbage collectors, including conservative garbage
collectors, require some level of programmer assistance. We describe an autonomous
garbage collection method that meets the research objective: detect and resolve memory
leaks without programmer, compiler or linker assistance.
...
To accomplish autonomous garbage collection, the following information must be
automatically acquired from within the target application:
? location of heap storage
? location (starting address and size) of all allocated heap segments
? location of the program stacks
? location of writable program memory"

And a large source of info about this in http://citeseer.nj.nec.com/wilson92uniprocessor.html

I will continue collecting ideas in order to achieve useful and up-to-date pieces of code.

Keep on touch,
Posted on 2003-03-04 10:00:55 by pelaillo
I'm no guru, but I am interested in this, in fact I'm right in the middle of writing my own.

I have for a long time wondered about the best method of maintaining dynamic arrays of same sized objects. Usually small objects.

Are the built in windows routines good enough though? Heaps, and espicially the new Low-fragmentation Heap seem pretty good. I only wonder about the efficiancy of Microsofts code.
Posted on 2003-03-04 10:16:43 by Eóin
Well, I've implemented my memory manager... I feel it might be a hack, because I'm on a time schedule and don't have time to write my own malloc/free/realloc... So what I've done is used libc's malloc/free/realloc and basically implemented a smart pointer system.

In the end I defined a structure like this:


DWORD ptr
WORD refcount
WORD reserved
DWORD next_ptr

Where ptr represented the actual memory where the data was located, refcount is the number of references to the memory in ptr and next_ptr is a pointer to the next block. I've abstracted everything to the point of three calls:

pasm_alloc, size - returns a PASM pointer to a memory location of desired size
pasm_free, pptr - frees the memory at PASM pointer pptr
pasm_realloc, pptr, size - reallocates the size of the memory @ PASM pointer to pptr
pasm_getptr, pptr - Resolves a PASM pointer to an actual pointer in memory.

PASM pointers are essentially pointers to the start of a block.

In terms of garbage collection, any time pasm_free is called it does the following:

attempts to find the block pointed to by the PASM pointer.. if it can't find it causes an error
if it finds it, it frees the memory that the block is pointing to as well as the memory for the block itself and the list is re-linked.

I think this system will be a fairly good system for my needs. I don't expect to see a great deal more internal or external fragmentation than with any other memory management system. From everything I can tell so far because I create a new node and then create the data for that node it's usally right next to each other in memory. That means when I free a variable I'm freeing VARSIZE + BLOCKSIZE and that keeps the fragmentation fairly low I think. Of course with any memory manager you could intentionally cause fragmentation by allocating 10000 small variables and then trying to allocate 10000 large variables, but even then the fragmentation will be minimized since the "blocks" which point to those variables will still be allocated where the others were freed.

John
Posted on 2003-03-08 13:19:12 by coogle
>
Originally posted by coogle
>Well, I've implemented my memory manager... I feel it might be a hack, because >I'm on a time schedule and don't have time to write my own malloc/free/realloc...
Perhaps this link could help you:
http://www.programmersheaven.com/c/MsgBoard/read.asp?Board=233&MsgID=115642&Setting=A9999F0003
Posted on 2003-03-28 11:58:44 by octavio
Indeed it does... thank you much :)

John
Posted on 2003-03-28 13:18:39 by coogle
using a simple linked list for memory allocation will probably end up worse than straight malloc/heapalloc/whatever calls - however it's a good excercise to write your own allocation schemes, and nobody says you should stop at the "na?ve" linked list implementation; messing with memory allocation is fun.

You already wrote pasm_* memory management functions, this is good; it will allow you to change the internal workings without modifying your applications.

There's no way to acheive 100% platform independance, you will have to use platform dependant memory allocation primitives, unless you want to design a system built ontop of libc malloc() - and then you're dependant on the libc "platform".
Posted on 2003-03-31 03:08:58 by f0dder
Actually, you can get platform-independence if you're willing to preallocate the "private heap" space. The down side is needing to set the maximum size (at assembly time).
Posted on 2003-04-01 15:55:54 by tenkey
that would be rather ugly for PCs. Might be an option if you're doing console or embedded development, though.
Posted on 2003-04-02 08:22:02 by f0dder
Private heap space is used plenty in PC games.
Posted on 2003-04-02 08:59:12 by ThoughtCriminal
I doubt they "Preallocate" it, though. The way I understood tenkey, anyway, which would be to declare some largeish static array and implement a heap manager ontop of that.
Posted on 2003-04-03 00:34:45 by f0dder
So What I'm hearing is that the best method to making a cross-platform memory management system would be to basically allocate a large block of memory statically and then write memory management functions such as malloc() which allocate chucks of that larger portion?

John
Posted on 2003-04-03 00:38:48 by coogle

...allocate a large block of memory statically...

It's the only method that would work without depending on the operating system, but it is in no way a good generic approach. You still need to recompile anyway, so it's much better to have a compile-time if statement that chooses the underlying block allocator. libc malloc for generic systems (and linux, unless you're going to be dirty and do syscalls), heapalloc for win32, <whatever> for <whatever>, and static allocation if you have some quirky system.
Posted on 2003-04-03 00:47:55 by f0dder
Thanks for the tip... I like that approach of deciding what type of memory allocation will be done at compile time, and since I've abstracted my current malloc() calls it'd be a easy task to plug-in-pray new memory allocation routines in.

What would be REALLY cool is if FASM could determine the OS at assemble-time so I could do:


if OS equ "Linux"

else if OS equ "Win32"

else if..


But can't have it all :)

John
Posted on 2003-04-03 00:51:45 by coogle
if it doesn't already have that facility, it should be very easy to add. And until then, you could set those defines at build-time. Code on :)
Posted on 2003-04-03 00:55:11 by f0dder