I'm developing a LISP-based language/LISP dialect to test the following thesis:
"By making bytecode/machinecode instructions imbeddable in LISP S-expressions/lists, the entire LISP language's primitives may be implemented as macrofunctions that return such an S-Expression/list of bytecode/machinecode.  Further, compilation optimizations may be performed by manipulating an S-expression/list composed of bytecode/machinecode."

Currently I'm trying to struggle with garbage collection, and I think the algo I've thought up sucks (wanting to support multiple threads without being able to view their registers is rather bad, since this means I can't safely throw away anything the current thread didn't create).  I can probably live easier if I didn't need to support multiple threads, but I really really really want to try and go for multiple threads.  It's partly generational, bit of a mark-and-sweep.  I'm wondering about mark-and-compact but it would probably be difficult without using handles, or using some kind of rather slowish scheme to figure out where a memory address now is.

As to the "bytecode", I'm currently developing a bastardized version of intel assembly language, one nearer to the way it's actually encoded rather than to the intel-specified syntax.  Guess what?  Most instructions have what I call a "modreg"; it's what specifies an eax,edx or a [4],eax (intel labels them as "mod | reg | r/m" according to the fields within the byte; hence I call them modreg).  A few instructions have what I call a "modmem", which only involves a memory address expression, such as label or label[4]; instructions such as mov foo,4 use modmems.  Some of the operand's bits, too (3, to be specific) actually overflow into the modmem (some of intel's instructions consistently drop the reg field of modregs, so I now have a new class of bytecode, modmem, that I'm handling separately from modreg).

I'm developing my own "bytecode" because I'm too lazy to go all the way to writing a "real" assembler.  Besides, it strikes me as being ridiculously difficult to figure out how to encode label as a LISP S-expression.

So what do I want?  Nothing much, just a couple comments and a bit of a place to talk about this idea. woot woot.

This is of course solely in my nonexistent spare time, between courting a girl who lives a couple hundred miles from where I happen to live (on a different island, at that), keeping my boss off my neck (my performance is dropping because of some.... personal emotional problems, mostly related to a previous girl who wasn't my girlfriend), writing bits of that novel I'm writing (about the Golden Lily operation, pre-Hispanic Philippines, and modern microelectronics), and trying to get enough sleep, food, and exercise to keep myself out of the hospital (it's 1251AM where I am now, I forgot to eat lunch yesterday because I was working, and my body still aches from our basketball round yesterday afternoon).  Gosh, I can't even update myself on what's happening to Bogdan's HE.  Sorry Bogdan, what's up with HE?

Anyway, any comments, suggestions, criticism, flames, etc. are welcome.  Anything but silence.

Posted on 2005-05-08 12:06:06 by AmkG
:shock:

"on a different island"

!!!

goodluck
Posted on 2005-05-09 10:30:17 by HeLLoWorld

Currently I'm trying to struggle with garbage collection, and I think the algo I've thought up sucks (wanting to support multiple threads without being able to view their registers is rather bad, since this means I can't safely throw away anything the current thread didn't create).? I can probably live easier if I didn't need to support multiple threads, but I really really really want to try and go for multiple threads.? It's partly generational, bit of a mark-and-sweep.? I'm wondering about mark-and-compact but it would probably be difficult without using handles, or using some kind of rather slowish scheme to figure out where a memory address now is.


Normally, the GC is called by an allocator when it can't immediately get the memory it needs. It can also be invoked on demand by the programmer. If those are the only times, then all you need to do is lock the heap (well, actually the GC itself) so that a second attempt to GC must wait for the other thread to finish. It is defiinitely hard if you want to run the GC concurrently in its own thread. For the last case, I would suggest researching what others have done.


I'm developing my own "bytecode" because I'm too lazy to go all the way to writing a "real" assembler.? Besides, it strikes me as being ridiculously difficult to figure out how to encode label as a LISP S-expression.


So (opcode register displacement index1 index2 scale immediate) and using NIL for missing components is not an answer?
Posted on 2005-05-10 00:08:09 by tenkey


Currently I'm trying to struggle with garbage collection, and I think the algo I've thought up sucks (wanting to support multiple threads without being able to view their registers is rather bad, since this means I can't safely throw away anything the current thread didn't create).? I can probably live easier if I didn't need to support multiple threads, but I really really really want to try and go for multiple threads.? It's partly generational, bit of a mark-and-sweep.? I'm wondering about mark-and-compact but it would probably be difficult without using handles, or using some kind of rather slowish scheme to figure out where a memory address now is.


Normally, the GC is called by an allocator when it can't immediately get the memory it needs. It can also be invoked on demand by the programmer. If those are the only times, then all you need to do is lock the heap (well, actually the GC itself) so that a second attempt to GC must wait for the other thread to finish. It is defiinitely hard if you want to run the GC concurrently in its own thread. For the last case, I would suggest researching what others have done.


Actually I do plan on locking the heap (or rather, my heaps).? My problem is that I don't know how to access the stack/registers of another thread.? Since in theory I could share data with another thread (say, via a global), I want to make sure I don't trash anything that might be used by another thread.? The scheme I've worked out initially trashes the most recent data allocated in the current thread (the one which called the GC), then recovers anything referenced from the root set.? The root set is:
1. the virtual V register (intel EBP)
2. the stack
3. all memory areas currently allocated

Since we free only from the current thread, anything allocated on another thread would be part of the root set (as per #3) and would recover any memory passed to it.

Unfortunately it strikes me that #3 would take a long time, which is why I'm not very hot on this....



I'm developing my own "bytecode" because I'm too lazy to go all the way to writing a "real" assembler.? Besides, it strikes me as being ridiculously difficult to figure out how to encode label as a LISP S-expression.


So (opcode register displacement index1 index2 scale immediate) and using NIL for missing components is not an answer?


I realized an even better option.? As it happens, the "mutator" (my bytecode-level optimizer) would have an easier time if I were able to give a more "normal" (intel-based) syntax.? I decided on the following:
1. Registers are symbols A B C D (corresponding to eax ebx ecx edx), V (ebp), X Y (esi edi), S (esp).
2. Memory references will be enclosed in a list.? For instance, DWORD PTR would be (A).? All objects in such a memory reference list would be "added" together, in the same sense that the assembler "adds" foo[4].
3. An "embed" character, #, will imply that the succeeding object's numeric value (the pointer) will be included.? Since small numbers encodable in 16 bits would be an object too, we could encode:
mov DWORD PTR [4],5
as:
(MOV (A #4) #5)
and:
mov DWORD PTR z[8],eax
as:
(MOV (#z #8) A)
this allows us to use A, B, C, D etc. as both a symbol and a register name.
4. scaled indices would themselves be in lists:
mov DWORD PTR foo,3
=>
(MOV (#foo (A 4)) 3)



OMG tenkey we think alike!!!

I have worked out a GC (Garbage Collection) scheme for objects. Each object has an embedded size field because it's simpler and faster to GC than if the size is stored in a class or "type" table. Because I'm allocating in DWORDs, the size has two redundant bits (they are always 0). If I use MOVSD, I can eliminate the two bits by storing the DWORD count instead of the object size. And because the objects can only be allocated in user memory (the lower 2G), there is another redundant bit. I can use one bit to indicate that an object has been collected, and another bit to indicate whether the object contains pointers only or no pointers. The all-or-none pointer property of objects is also a way to simplify GC. The GC will probably be the only code that actually uses this size-and-attributes field.

The rather creepy part is that my defines are done this way:

GC_USEDBIT EQU 80000000h
GC_DUMBBIT EQU 40000000h
GC_GENBIT EQU 20000000h
GC_LENGTH EQU 1fffffffh

My reasoning was the same as yours: two bits free because of DWORD boundaries, and another bit because it's not likely to reach addresses in the 80000000h range.

My "USEDBIT" corresponds to "an object has been collected", my "DUMBBIT" corresponds to "contains pointers only or no pointers."  The "GENBIT" was free and allocated for generational GC.  I do have an additional field, for a pointer to the previous allocated memory area, since I need to keep track of which areas were allocated in the current thread.



Posted on 2005-05-12 01:13:14 by AmkG

:shock:

"on a different island"

!!!

goodluck

The country where I live, the Philippines, is an archipelago composed of several islands.  And actually, she's my girlfriend.  Sort of.  Err.  Gosh, I guess I know more about hashing functions than about that sort of thing.  :P :P :lol:
Thanks for the luck anyway.
Posted on 2005-05-12 05:29:54 by AmkG

Actually I do plan on locking the heap (or rather, my heaps).? My problem is that I don't know how to access the stack/registers of another thread.? Since in theory I could share data with another thread (say, via a global), I want to make sure I don't trash anything that might be used by another thread.? The scheme I've worked out initially trashes the most recent data allocated in the current thread (the one which called the GC), then recovers anything referenced from the root set.? The root set is:
1. the virtual V register (intel EBP)
2. the stack
3. all memory areas currently allocated

Since we free only from the current thread, anything allocated on another thread would be part of the root set (as per #3) and would recover any memory passed to it.

Unfortunately it strikes me that #3 would take a long time, which is why I'm not very hot on this....


I forgot about the registers. The compiler needs to ensure all heap pointers loaded into registers are rooted in memory. Heavy optimization may mean that a compacting or copy collector is not a good idea.

One issue I had with putting heap pointers on the ESP stack was the nonheap data that exists in a callback (window proc). I wanted a heap that worked in a GUI app. I was looking at two options: creating a second EBP chain that chained together application stack frames, and implementing the stack in the heap.
Posted on 2005-05-15 18:52:11 by tenkey
As to the registers, I followed STDCALL's rule that the called procedure/function can trash all registers except ebp (V), ebx (B), esi (X) and edi (Y).  I modified the rule so that only V is saved by a macro call, and Y can never be used because it keeps track of the memory areas allocated in the current thread.  Hence, the macros must assume that all registers are trashed if they embed a function call; they must save any registers on the stack, at least.  In case the function call is actually a macro that expands to bytecode that isn't a function call, removing the saving of registers would be the task of my "mutator" or bytecode-level optimizer.

Since I want my LISP to also be able to communicate well with windows (using the IA32 native code rather limits me to that OS, after all), my GC will have to check my stack's contents if they are indeed pointers to memory areas.  A second problem is with function calls.  Function return addresses point to the middle of a code area, so I will have to figure out some way to get the start of a code area from a function return address.  In theory, after all, I could release a function from all memory areas while that function is calling me.  Still, the function isn't free because it's still running.

Because I want to have lexical binding and persistent environments in my LISP, my local variables will have to be on the heap too, which is just too bad.  I guess I can rewrite my 'fn macro (equivalent to 'lambda on other LISPs) so that it will attempt to guess if the function's environment could possibly persist, and if definitely not, will keep locals on the stack.

Thanks for the interest!

Posted on 2005-05-15 21:37:59 by AmkG