I don't know if this belongs here. If not please move or delete it ;)

I have a question.. I'm writing a compiler myself and I have a simple base now: calling API functions, variables, arrays and so on. Now I want to take it a step further and implement OOP, but I have one big problem.. I guess I don't get how this should work. When I look at ObjAsm to figure it out I just end up totally confused. I think it's because I am confused by all these macros and cannot figure out how it all works together. Does someone know where I could find a simple explanation of how things work, or some technical documentation on OOP?

I came up with a solution myself.. it would work, but I don't think it's the "standard" way of doing OOP

my idea was:

for example we have the class DOG

class Dog
color dword

func bark()
end class

now this is compiled as a template and linked into the PE file
if you then do

x new dog

memory of size sizeof(class Dog) would be allocated
the variables would be copied first into this memory block by CopyMemory from the code section
and then the bark function would be copied into the memory
dog.color.addr would now hold an address inside that memory, so accessing dog.color modifies that memory
dog.bark.addr would likewise now point into that memory

but I see the problem when

func bark()
color = 2
end func

would happen... because.. now.. some mov ,2 would happen...
so I decided.. this must be the wrong way....

I would be pleased if some smart mind could give me some simple examples of how to create a simple class in memory,
how to lay it out, and maybe some basic code, even in assembler. If I get the raw idea I think I could finally get it..

Thanks for reading,
Posted on 2007-11-08 18:02:44 by Emod
You should first separate a few aspects of OOP:

1. objects
2. automatic resource freeing
3. structured exception handling

Each of these is a separate issue.

Each object can have data and methods. Methods that are not virtual (can't be overridden) can be called as normal functions. Virtual (overridable) methods must be reached through a vtab (virtual table). A virtual table is an array of pointers to methods.

Each class with virtual methods has its own virtual table. If an object has virtual methods, then its very first variable (hidden from the programmer) is a pointer to its class's virtual table. This pointer can also be used as run-time type info, because it is unique for every class. If the object has data, it follows the pointer to the virtual table.

If class X inherits class Y, then both the objects of X and the virtual table of X must be layout-compatible with those of Y. That means the order of things (variables and pointers to virtual methods) must be the same. Of course, X can add more data and methods at the end; that is the point of inheritance.

When calling a virtual method, you must load the pointer to the right function (method) for your object from the virtual table the object points to.
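
The layout described above can be sketched in C, which maps almost one-to-one onto what a compiler would emit. This is only an illustration under my own assumptions: the names `Animal`, `Dog`, `AnimalVtbl` and `call_speak` are made up, and a real compiler is free to lay things out differently.

```c
#include <assert.h>
#include <stddef.h>

struct Animal;                                /* forward declaration */

/* Virtual table: pointers to the virtual methods, one slot per method. */
struct AnimalVtbl {
    int (*speak)(struct Animal *self);        /* slot 0 */
};

/* Object of class Animal: hidden vtable pointer first, then the data. */
struct Animal {
    const struct AnimalVtbl *vtbl;            /* hidden first member */
    int color;                                /* data follows the pointer */
};

/* Class Dog inherits Animal: identical layout at the start, new data at the end. */
struct Dog {
    struct Animal base;                       /* must come first */
    int tricks;
};

static int animal_speak(struct Animal *self) { (void)self; return 1; }
static int dog_speak(struct Animal *self)    { (void)self; return 2; }

/* One table per class, shared by every instance of that class. */
static const struct AnimalVtbl animal_vtbl = { animal_speak };
static const struct AnimalVtbl dog_vtbl    = { dog_speak };

/* A virtual call: load the vtable pointer from the object,
 * then call through the slot that belongs to the method. */
int call_speak(struct Animal *obj) {
    return obj->vtbl->speak(obj);
}

int demo_animal(void) {
    struct Animal a = { &animal_vtbl, 0 };
    return call_speak(&a);                    /* dispatches to animal_speak */
}

int demo_dog(void) {
    struct Dog d = { { &dog_vtbl, 0 }, 3 };
    return call_speak(&d.base);               /* dispatches to dog_speak */
}
```

`call_speak` never mentions `Dog`, yet it reaches `dog_speak` because the table pointer sits at the same offset in both object layouts.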

2: Simply put, your compiler has to detect every resource acquisition and automatically insert code that frees the resource at all appropriate places (including when an exception is thrown).
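
As a rough picture of point 2, here is a C sketch of what the compiler's output could look like for a function with two exit paths. Everything here is hypothetical (`File`, `file_open`, `file_close`, `work`, and the source syntax in the comment), and it only covers plain returns, not exceptions.

```c
#include <assert.h>

static int live_resources = 0;    /* counts resources currently held (demo only) */

struct File { int handle; };

/* Stand-ins for a constructor and a destructor. */
static void file_open(struct File *f)  { f->handle = 1; live_resources++; }
static void file_close(struct File *f) { f->handle = 0; live_resources--; }

/* What the programmer wrote (hypothetical source language):
 *     func work(fail)
 *         f new File
 *         if fail then return -1
 *         return 0
 *     end func
 * What the compiler would emit: a destructor call before every exit. */
int work(int fail) {
    struct File f;
    file_open(&f);
    if (fail) {
        file_close(&f);           /* inserted by the compiler: early return */
        return -1;
    }
    file_close(&f);               /* inserted by the compiler: end of scope */
    return 0;
}

int resources_still_open(void) { return live_resources; }
```

The programmer never writes `file_close`; the compiler tracks which objects are alive at each exit and emits the calls itself.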

3: This is a big topic in itself, and there are many ways to do it, each with its own advantages and disadvantages. I won't go into it much now.
Posted on 2007-11-08 19:48:07 by vid
Thanks vid, that helped me a lot, especially the tip about the virtual table. I did not have any clue about this before. I guess I will take a deeper look at this topic and look for more information on it. I hope I can figure out a way.
Posted on 2007-11-09 01:13:08 by Emod
Emod: C++ has a "thiscall" calling convention for member functions. With Microsoft's compilers on 32-bit x86, "this" (i.e., the pointer to the object instance) is passed in ECX. You then reference all member data through ECX+offset instead of fixed static addresses.
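
Applied to the `color = 2` problem from the first post, a minimal C sketch could look like this. `dog_bark` and `demo_this` are illustrative names only; the point is that the method code exists once and gets the object's address as a hidden parameter, so nothing has to be copied into the object.

```c
#include <assert.h>

struct Dog {
    int color;                    /* at offset 0 from `this` */
};

/* Source:  func bark()  color = 2  end func
 * Compiled form: the method receives a hidden `self` (the "this" pointer),
 * and `color = 2` becomes a store to [self + offset of color], roughly
 * `mov dword ptr [ecx+0], 2` when `this` is passed in ECX. */
void dog_bark(struct Dog *self) {
    self->color = 2;
}

/* One copy of the code serves every instance; only the data is per object. */
int demo_this(void) {
    struct Dog x = { 0 }, y = { 7 };
    dog_bark(&x);                         /* x.bark() */
    return x.color * 10 + y.color;        /* x changed, y untouched */
}
```

So `x new dog` only needs to allocate and initialise the data; a call like `x.bark()` compiles to "pass the address of x, call the single copy of bark".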
Posted on 2007-11-09 06:51:33 by f0dder
Well, let me tell you what I've got so far from reading. I did not know anything about all this before, or how the compiler must handle it; I had only used classes.

Reading about virtual tables always led me to COM objects, which I think I understand: you have an IUnknown interface whose first 3 dwords are pointers to 3 functions (QueryInterface, AddRef and Release), so you must call through a pointer to a pointer, and then maybe you get the interface or something. I stopped reading at this point because I was totally confused. Ok if I use a vtable for classes I think it's something totally different? isn't it? I just have to build my own table without QueryInterface and all that stuff, but the principle should be the same? Or does every class then need such a vtable, so that it accesses all functions and data of the class through this interface?

hmm f0dder did you mean something like

class dog
dword color
func changecol
this.color = 2
end func
end class

Posted on 2007-11-09 08:05:13 by Emod
some examples:

this is some OOP pseudocode:

class a
 method m1();
 virtual method m2();
 virtual method m3();
 int data1;
class b inherits a
 overridden method m2();
 overridden method m3();
 method m4();
 virtual method m5();
 int data2;

in pseudo assembly:

;method a::m1()
proc a__m1  ...

;method a::m2()
proc a__m2  ...

;method b::m2()
proc b__m2  ...

;method a::m3()
proc a__m3  ...

;method b::m3()
proc b__m3  ...

;method b::m4()
proc b__m4  ...

;method b::m5()
proc b__m5  ...

;virtual table for a
a__vtab:
 dd a__m2   ;offset 0: pointer to implementation of method m2
 dd a__m3   ;offset 4: pointer to implementation of method m3

;virtual table for b
b__vtab:
 dd b__m2   ;offset 0: same slot as in a's table
 dd b__m3   ;offset 4: same slot as in a's table
 dd b__m5   ;offset 8: new virtual method added by b

;instance "foo" of class "a"
foo:
dd a__vtab  ;pointer to vtab
dd 10   ;value of data1 member

;instance "bar" of class "b"
bar:
dd b__vtab  ;pointer to vtab
dd 10   ;value of data1 member
dd 15   ;value of data2 member

;calling non-virtual method foo.m1()
mov ecx, foo
call a__m1

;calling foo.m3()
mov ecx, foo
mov eax, dword ptr [ecx]   ;eax = address of virtual table
call dword ptr [eax+4]  ;call pointer from virtual table (m3 is the second entry)

;calling bar.m3() - same as previous
mov ecx, bar
mov eax, dword ptr [ecx]   ;eax = address of virtual table
call dword ptr [eax+4]  ;call pointer from virtual table (m3 is the second entry)

As you can see from the last two pieces of code, calling a virtual method on an instance of class a is the same as calling it on an instance of class b. That is because class b inherits class a, so everything that is in a class a object is also in a class b object, in the same order.

That means you can write a procedure that works with class "a", then pass it an instance of class "b", and it will work fine. The procedure doesn't have to know anything about class "b", yet you can still use it with one. This is one of the most important points of OOP.
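
The same classes a and b, written out as a C sketch for anyone who reads C more easily than assembly. The method bodies, `sum_m2_m3`, and the `demo_*` helpers are invented for illustration; only the class shapes and the `a__`/`b__` names come from the pseudocode above.

```c
#include <assert.h>

struct A;
struct AVtab { int (*m2)(struct A *); int (*m3)(struct A *); };
struct BVtab { struct AVtab a; int (*m5)(struct A *); };  /* a's slots first, then m5 */

struct A { const struct AVtab *vtab; int data1; };        /* vtab pointer, then data1 */
struct B { struct A a; int data2; };                      /* a's layout, then data2 */

/* Made-up method bodies, just so the calls return something visible. */
static int a__m2(struct A *self) { return self->data1; }
static int a__m3(struct A *self) { return self->data1 + 1; }
static int b__m2(struct A *self) { return ((struct B *)self)->data2; }
static int b__m3(struct A *self) { return ((struct B *)self)->data2 + 1; }
static int b__m5(struct A *self) { (void)self; return 5; }

static const struct AVtab a__vtab = { a__m2, a__m3 };
static const struct BVtab b__vtab = { { b__m2, b__m3 }, b__m5 };

/* This procedure knows only class a: it loads the table pointer
 * and calls through slots 0 and 1, whatever they point to. */
int sum_m2_m3(struct A *obj) {
    return obj->vtab->m2(obj) + obj->vtab->m3(obj);
}

int demo_foo(void) {
    struct A foo = { &a__vtab, 10 };
    return sum_m2_m3(&foo);                 /* a__m2 + a__m3 = 10 + 11 */
}

int demo_bar(void) {
    struct B bar = { { &b__vtab.a, 10 }, 15 };
    return sum_m2_m3(&bar.a);               /* b__m2 + b__m3 = 15 + 16 */
}
```

`sum_m2_m3` is compiled once and never changes, even if classes inheriting from a are added later.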
Posted on 2007-11-09 08:09:16 by vid
Wow, genius! :shock: thank you that's some great and simple code! Thanks for this piece I will study it and see what I can get and make of it. I'll let you know. Thanks for taking the time to write this down and helping me!
Posted on 2007-11-09 08:18:43 by Emod
Yes, the approach is very similar to COM objects. Of course, all COM interfaces inherit IUnknown with its 3 methods, but you can use the same idea without inheriting IUnknown.

Ok if I use a vtable for classes I think it's something totally different? isn't it?

No, COM uses the vtable in the same way. When you get an object instance, its first member is a pointer to the vtable. There is only one vtable per class, shared by all its instances. Data is contained in the object instance, along with the pointer to the vtable, not in the vtable.

Try it yourself: allocate two instances of the same COM class and compare their vtable pointers (the first dword). You will see they are the same.

Data (member variables) are stored in the object instance, so every object can have different values for them. But methods are the same for every instance of the same class, so storing pointers to methods in each object instance would be an unnecessary waste of memory. For that reason they are stored in a single place, the virtual table, and the object instance just holds a pointer to that virtual table.
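
A small C sketch of that sharing, and of using the table pointer as a run-time type check. The names `Obj`, `Vtab`, `same_class` and `is_instance_of` are hypothetical, and note the check is for the exact class only: an instance of an inheriting class has a different table pointer.

```c
#include <assert.h>

struct Obj;
struct Vtab { int (*m2)(struct Obj *); };
struct Obj  { const struct Vtab *vtab; int data1; };  /* per-object: pointer + data */

static int a__m2(struct Obj *self) { return self->data1; }
static int b__m2(struct Obj *self) { return -self->data1; }

static const struct Vtab a__vtab = { a__m2 };   /* exists once for class a */
static const struct Vtab b__vtab = { b__m2 };   /* exists once for class b */

/* Two objects are of the same class if their first pointers are equal. */
int same_class(const struct Obj *x, const struct Obj *y) {
    return x->vtab == y->vtab;
}

/* Run-time type check: compare the object's table pointer with a class's table. */
int is_instance_of(const struct Obj *obj, const struct Vtab *class_vtab) {
    return obj->vtab == class_vtab;
}

int demo_shared(void) {
    struct Obj foo1 = { &a__vtab, 1 };
    struct Obj foo2 = { &a__vtab, 2 };          /* different data, same table */
    struct Obj bar  = { &b__vtab, 3 };
    return same_class(&foo1, &foo2) * 100       /* 1: both are class a */
         + same_class(&foo1, &bar) * 10         /* 0: a versus b */
         + is_instance_of(&bar, &b__vtab);      /* 1 */
}
```

However many virtual methods a class gets, each object still pays for only one pointer.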

In COM objects, every single method is in the virtual table. That is because COM is used for inter-language and inter-process communication. When you don't need this, you only have to place virtual (overridable) methods in the virtual table. Other methods can be called directly as procedures.

The reason why virtual procedures must be placed in the virtual table can be seen in my previous example. Imagine some procedure which works with class "a" objects. You can pass it either an instance of class "a" or an instance of class "b". If that procedure wants to call method m2(), it should call "a__m2" if the object is an instance of class "a", and "b__m2" if the object is an instance of class "b". This is easily accomplished by placing the pointers in the virtual table.

Hope I didn't confuse you too much :)

PS: I updated the example code; there was a slight mistake in it
Posted on 2007-11-09 08:23:36 by vid