at the beginning: i thank everyone who will read this (i don't want this thread to be another unanswered one)

the system-functions in my os can currently be called by the call gate int 30h. up to now i'm using register calling convention (i.e. i pass all parameters via the GPRs and esi/edi). this is damn inflexible and uncomfortable. i was thinking about a better calling convention using the stack and al for a function-number.

here's my sketch:

push param2
push param1
push param0 ;reverse push the params
mov al,0 ;the function-number
int 0x30
add esp, 3*4 ;correct the stack

int30_handler: ;entered via int 0x30 (label name assumed)
movzx eax,al
call [calltable+eax*4]
iret

calltable:
dd function0
dd function1

function0:
push ebp
lea ebp, [esp+20] ;point to beginning of params
;do something
; push dword [ebp+4] ;param1
; push dword [ebp] ;param0
; call function1 ;<-- not possible (explained below)

push dword [ebp+4] ;param1
push dword [ebp] ;param0
mov al,1
int 0x30 ;the only possibility... pushing another 20 bytes
add esp, 2*4 ;correct the stack before restoring ebp

pop ebp
ret

i know that it costs 20 bytes on the stack just for calling without parameters. furthermore, if a system-function wants to call another system-function (for example a line-function needs putpixel) it has to go through int 0x30 too, since the functions assume 20 bytes between esp and the params, so i can't call them directly (the commented-out call above).
i somewhere heard that such callgates being called from ring3 mix up the cache making things slower?

the questions:
- is this convention a good one?
- is there a better one? (remember that i'll keep my ring3-callgate)
- will it work from ring3?

topic switch, but still my os. this is a taskstate-segment:

tss (104 bytes):
dw previous task link, reserved
dd esp0
dw ss0, reserved
dd esp1
dw ss1, reserved
dd esp2
dw ss2, reserved
dd cr3
dd eip
dd eflags
dd eax
dd ecx
dd edx
dd ebx
dd esp
dd ebp
dd esi
dd edi
dw es, reserved
dw cs, reserved
dw ss, reserved
dw ds, reserved
dw fs, reserved
dw gs, reserved
dw ldt segment selector, reserved
dw trace:0, i/o map base address (=sizeof(TSS))

i don't know exactly the usage of ss0:esp0, ss1:esp1 and ss2:esp2.
from the intel software developer's guide:

Privilege level-0, -1, and -2 stack pointer fields:
These stack pointers consist of a logical address made up of the segment
selector for the stack segment (ss0, ss1, ss2) and an offset into the stack
(esp0, esp1, esp2). Note that the values in these fields are static for a
particular task; whereas, the ss and esp values will change if stack switching
occurs within the task.

can somebody tell me what that means? i hope it doesn't say that i need 4 stacks, one for ring3, one for ring2 and so on! is the stack switched when a ring3-task switches to ring0? (i.e. the ss0:esp0-values are used until switching back?) i hope it's not like that since i don't want to make 4 stacks...

thanks for reading, thanks for thinking
Posted on 2003-07-14 14:09:08 by hartyl
Good to have C calling convention! Yes, there must be different stacks - what kind of protection would there be otherwise! :)
Posted on 2003-07-14 16:01:05 by bitRAKE
Indeed STDCALL is a good calling convention... unfortunately yes, there is a separate stack for each ring switch via the TSS :(

Besides, callgates are NOT slow. FreeBSD/BSD uses callgates, as opposed to Linux's int with register parameters, and it is much faster... unfortunately mixing stack-based and register-based (al) parameters is not to my taste...

Also, using a callgate and INT might not be that good; a simple callgate in the GDT / IDT could work better IMHO
Posted on 2003-07-14 16:24:30 by BogdanOntanu

Also, using a callgate and INT might not be that good; a simple callgate in the GDT / IDT could work better IMHO

how would that work? the callgate is just an entry in the idt with some special bits set. what would your callgate look like using the gdt and idt?

since this is a ring3-callgate it should be possible to call from there, right (i currently have no chance to test it)?
do you have some ideas for reducing the amount of bytes being pushed on the stack? (20 for every call is pretty much - nothing for recursive functions).

another question: how is the normal calling convention done? somehow the stack is corrected inside the function, you don't need to do it after the call - i have no idea how to do that, so: can somebody tell me?
Posted on 2003-07-15 04:03:25 by hartyl

forget about the 20 bytes and get it running now.

Mixing stack and register based parameters is not that pretty. I
recommend you C calling too. Register based calling was even a
pain under MC680x0/Amiga with 14 registers (d0-d7, a0-a5).

Do you really want to use ring1 and ring2? For what purpose?
Of course there are 4 rings but I'd rather use ring3 and ring0 only.
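
A minimal sketch of what that means for the TSS, assuming NASM syntax and the 104-byte layout from the first post. The names KERNEL_DS, kstack, kstack_top, tss and init_tss are made up for illustration, and the TSS memory is assumed to be zeroed beforehand. With only ring3 and ring0 in use, only ss0:esp0 has to be valid, because the CPU reads just the stack fields of the ring it is switching to:

```nasm
bits 32

KERNEL_DS equ 0x10 ; assumed ring0 data/stack selector

section .bss
kstack: resb 4096 ; ring0 stack for this task (assumed size)
kstack_top:
tss: resb 104 ; assumed to be zero-filled before use

section .text
init_tss:
mov dword [tss+4], kstack_top ; esp0
mov word [tss+8], KERNEL_DS ; ss0
mov word [tss+102], 104 ; i/o map base = sizeof(TSS), i.e. no i/o bitmap
ret
```

ss1:esp1 and ss2:esp2 (offsets 12 to 26) can stay zero as long as nothing ever runs at ring1 or ring2.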

Bye Miracle
Posted on 2003-07-15 04:24:50 by miracle
you are right, ring1 and ring2 are not necessary.
the idea is now the following:

     |  stack
     |  --------
data |          |
sel  |          |
     |  code    |
     |  sel     |
     |          |

i want to make 2 selectors for every task: one data- and one code-selector. the top, say 4kb, of the data-segment is the stack, so esp points to the end of the data-segment.
so, if i now have a ring3 task, the selectors have DPL=3 (sure). since i don't want to make a separate stack for ring0, is it possible to halve the stack (esp points to the top of the data-segment, esp0 is 2 kb lower) and let the ring0-stack use the ring3-segment as well? (please say yes :))
i hope i explained it ok...

@miracle: could you explain to me how exactly the (original) c-calling works?
actually i don't mix register- and c-calling, just the function-number is in al - eax is trashed usually.
Posted on 2003-07-15 05:50:38 by hartyl
1) If you have the same stack for the ring3 app and the ring0 code then protection will suffer a lot; any ring3 application will be able to easily crash the OS kernel by misbalancing the stack.
2) IMHO (I might be wrong, though) you could use callgates in the GDT also.

However i do not use them --so i might be very wrong--

Mainly this is why i use only ring0 code in my OS :D

IMHO the ring3 to ring 0 switch will leave you the address of the old ring3 stack segment/pointer on stack and you could use that to read parameters of the ring3 API call from the ring0 implementation.
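
A rough sketch of that idea, in NASM syntax. It assumes a 32-bit interrupt gate, a caller running at ring3, and that the caller's stack lives inside the segment still loaded in ds on entry (as in the one-data-selector-per-task layout described above). The label int30_from_ring3 is made up for illustration:

```nasm
bits 32

; ring0 stack on entry from ring3 through a 32-bit interrupt gate:
; [esp+0] eip, [esp+4] cs, [esp+8] eflags, [esp+12] ring3 esp, [esp+16] ring3 ss
int30_from_ring3: ; hypothetical label
push ebp
mov ebp, esp ; ebp-relative accesses go through ss = ring0 stack
push esi
mov esi, [ebp+16] ; saved ring3 esp, points at param0
mov ecx, [esi] ; param0, read through ds (still the caller's data selector)
mov edx, [esi+4] ; param1
; ... check that esi is inside the caller's stack, then do the work ...
pop esi
pop ebp
iret ; pops eip, cs, eflags and switches back to the ring3 ss:esp
```

The parameters never have to be copied; the ring0 code only needs the saved ring3 esp to find them.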
Posted on 2003-07-15 05:59:59 by BogdanOntanu

1) If you have the same stack for the ring3 app and the ring0 code then protection will suffer a lot; any ring3 application will be able to easily crash the OS kernel by misbalancing the stack.

huh?! if a ring3 program generates a stack-overflow it will just crash with a gpf - the program, not the kernel (i think...)
i dunno if i explained it ok, but

     ---------- <-- esp
     |  r3-stack
     |---------- <-- esp0
     |  r0-stack
     |  --------
data |          |
sel  |          |
     |  code    |
     |  sel     |
     |          |
Posted on 2003-07-15 06:27:15 by hartyl
The C-Calling convention:
push 3
push 2
push 1
call _proc
; could use data on stack
add esp, 3*4

_proc:
mov eax, [esp + 4]
inc eax
xor [esp + 8], eax ; change data on stack for return values
retn ; don't correct stack
The main feature is that the parameters are left on the stack by the called function. This makes stack clean-up the job of the programmer/compiler -- they put the data on the stack, therefore they clean it up. Also, note that items on the stack can be changed and the program can MOV/POP them to get results! Or data can be left on the stack for multiple CALLs in a row! There is much flexibility.
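
For comparison, here is a small sketch of the callee-cleans variant (STDCALL, mentioned earlier in the thread), where the stack is corrected inside the function. It is NASM syntax, and the function name _sum is made up for illustration:

```nasm
bits 32

push 2 ; param1
push 1 ; param0
call _sum ; no "add esp" needed afterwards

_sum: ; hypothetical function
mov eax, [esp+4] ; param0
add eax, [esp+8] ; param1
ret 2*4 ; pop the return address, then drop 8 bytes of parameters
```

The trade-off is that the function must know its parameter size at assembly time, so leaving data on the stack for several calls in a row no longer works.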
Posted on 2003-07-15 08:34:25 by bitRAKE
multiple calls in a row, eh? sounds good :)
when travelling home from work i thought about this stack-switching thingy in combination with my calling convention - and i noticed its impossible (as i wanted to implement it).
follow my thoughts: i have a ring3-task wanting to call a system-function with parameters. first the params are reverse-pushed on the stack, the function number is set and finally int30 is called, the task enters ring0. and here comes the stack-switch! (due to changing the privilege level) ss:esp is filled with ss0:esp0 (where is the old ss:esp saved then?). things won't work since the parameters are not on the ring0 stack, they are on the ring3-stack.
i'm at the end of my ideas - is there a possibility to trick around? or just to *disable* the stack-switching (ya, i hate this function).

greets, hartyl
Posted on 2003-07-15 14:37:27 by hartyl
Well, you could do it like this:
movzx eax,al
push ebp
mov ebp,esp
mov esp,
mov ,esp

The great thing about this is that you can call the procedures directly from system code. And you may want to run the system code at Ring 1, so you can trap stack faults at Ring 0, since you'll now be using the user's stack.
Posted on 2003-07-15 15:56:47 by Sephiroth3
Sorry Hartyl,
But from your initial ascii art ... i thought you are using a single stack for both ring3 and ring0 code, my mistake :(

One of the extra features of C calling conventions is the fact that IF you push the WRONG number of parameters it "might" not get a crash.

This is because the stack will NOT be unbalanced at the end if you still clear it correctly.

So let's say a function will eventually print some garbled text on screen but will not GPF.

However this is not so TRUE anymore in a protected environment because the wrong text pointer might generate a GPF because it wanders into protected data ...

Adding the hassle that you always have to balance the stack by hand after every procedure/function call... i dislike it :)

Maybe only C compilers should use such a calling convention :) ---eh and a macro eventually ...
Posted on 2003-07-15 22:22:31 by BogdanOntanu

i thought you are using a single stack for both ring3 and ring0 code, my mistake :(

um... that's what i wanted to do. the dataselector contains the two stacks (for ring0 and ring3) and the data, the code-segment ends right before the stacks, but the base of it is equal to the data's base.

i took a look at the call gates yesterday. you have an idt-entry where some bits represent the number of parameters to be transferred between the stacks during the stack switch - but that's not a dynamic value. it then looks like this on the new stack: ss:esp (from old stack), parameters, cs:eip.
so, it's completely different whether i call from ring3 or from ring0 - and i can't find out where the function has been called from.
in intel's architecture system programming guide i saw a graphic showing when there's a stack-switch and when not. the different privilege levels were shown as levels. it showed that a call (via call gate) from ring3 causes a stack-switch but one from ring2 (ya, two) doesn't; so.. why? (i had no time to read the text above and below...)

different view: is there a possibility to prevent stack switches?

yet another view: what if i make 2 call gates, one can be called from ring3 - the gate *knows* that and can fiddle out the old ss:esp. the other one is for system-internal calls or for ring0, it doesn't need to do that.

last try: i know that the last values pushed on the stack are cs:eip of the calling task (due to the int30-call). i can check the low two bits of cs (the caller's privilege level) and if they're == 3, again, i know that there was a stack-switch and can look for the old ss:esp.

please *help* me :eek:
Posted on 2003-07-16 02:02:53 by hartyl
i got another approach - but this is the best (i think):
if i make a call gate with the params-bits set to 0 (i.e. nothing is transferred between the stacks) it will look like this on the new stack when a stack switch occurs:


if no stack-switch occurs:


by getting cs i can check the CPL before the call (cs & 0x0003). if it was 0, i don't need to fiddle around. if it was 3 i get the old ss:esp by swapping them with the new values. then i call the function with my calltable - everything will be processed on the old stack. when returning and the CPL was 3 on calling i swap them back - assuming the stack is still balanced.
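
a sketch of just the detection step, in NASM syntax. it assumes a 32-bit call gate with parameter count 0 that is entered by a far call; the label gate_entry is made up, and instead of swapping ss:esp it only leaves a pointer to the parameters in edx:

```nasm
bits 32

; with a stack switch (caller was ring3):
; [esp+0] eip, [esp+4] cs, [esp+8] old esp, [esp+12] old ss
; without a stack switch (caller was ring0):
; [esp+0] eip, [esp+4] cs, [esp+8] param0 ...
gate_entry: ; hypothetical label
test byte [esp+4], 3 ; low two bits of the saved cs = caller's privilege level
jz .same_stack
mov edx, [esp+8] ; old esp: offset of param0 in the caller's stack segment
jmp .have_params
.same_stack:
lea edx, [esp+8] ; params follow cs:eip on the current stack
.have_params:
; ... dispatch here, with edx pointing at param0 ...
retf ; pops eip and cs, plus esp and ss when returning to ring3
```

note that edx is an offset into the caller's stack segment in the first case and into the ring0 stack segment in the second, so the two are only interchangeable if both segments share the same base.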

could that work now? (oh man, i have many ideas here... :))
the last question then: how is a callgate called?

call CALLGATE_SEL:0x00000000


thanks all
Posted on 2003-07-16 06:22:24 by hartyl
Well, obviously you don't need a call gate to call system code from other system code. This code could just call the other code directly.
And when the system call returns, it doesn't matter whether the stack is balanced or not, since you can just update the old esp.
Posted on 2003-07-17 14:37:16 by Sephiroth3
i dunno, but i need a callgate for every systemfunction?
Posted on 2003-07-18 13:23:19 by hartyl
No, you can put the function number in the offset part. Then you would do this in the called code:

mov eax,
mov eax,
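
One way this can work, as a sketch only and not necessarily the code posted above: the CPU ignores the offset part of a far call that goes through a call gate, so the caller can put the function number there and the gate code can read it back from the instruction bytes. This is NASM syntax and assumes a direct 32-bit far call (7 bytes: opcode, 4-byte offset, 2-byte selector), a gate with parameter count 0, and that the caller's code is readable through ds at the same offsets. The label gate_entry is made up; calltable and CALLGATE_SEL are the names used earlier in the thread:

```nasm
bits 32

; caller: the offset field carries the function number
call CALLGATE_SEL:1 ; function 1

gate_entry: ; hypothetical label
mov eax, [esp] ; return eip, points just past the 7-byte far call
mov eax, [eax-6] ; 32-bit offset field of that call = function number
; ... range-check eax against the table size here ...
call [calltable+eax*4]
retf
```

This does not work for an indirect far call (call far [mem]), since the offset is then not stored next to the opcode.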
Posted on 2003-07-18 14:02:56 by Sephiroth3
afaik, then all functions must have the same number of bytes as parameters since the amount of memory copied is set in the gdt-entry; i would not know how to make this universal for every function.
the callgate-way seems to be very easy - if you have a good idea for the parameters, i'll use em.
Posted on 2003-07-19 12:49:47 by hartyl
You'd have to have one callgate for functions that take 1 parameter, another for functions that take 2 etc. Or you could use the technique with interrupts that I posted above
Posted on 2003-07-19 13:57:36 by Sephiroth3
i don't want to make a callgate for every parameter-count... that's complicated.
well, i've already implemented the interrupt-gate-way. but i don't like it that i have to put 20 bytes on the stack for just a call without parameters (and i want to preserve the registers). is there a way out with the interrupt gates?
Posted on 2003-07-19 20:53:05 by hartyl