So I'm working on a DOS project (writing a game engine in 8086 assembler, to be specific.) I'd like to build it as a COM file, mostly to save myself the hassle of linking individual bits together. If Wikipedia is to be believed, DOS's memory-management facilities are not available to COM programs; it states that "all memory is available to the application." This suits me fine, as I wanted a couple facets of memory management to line up with the rest of the engine more neatly, and it wasn't all that hard to hack up replacements for the routines I need. The thing is, I'm also trying to write the engine to play nicely with its operating environment; I want it to be able to make a nice clean exit and leave the computer in exactly the same state it found it, and I want to make sure that I know how to do my own memory management without clobbering any drivers or other TSR software loaded into conventional memory.
I'm currently figuring that it's a safe assumption that the program will be loaded fairly low in memory, so I don't need to bother with reading or writing below the start of the program segment, which means that I only need to worry about stuff that's been loaded under the top boundary of the conventional memory area (it's my understanding that this is where DOS and drivers tend to go.) I see an entry in the Program Segment Prefix for memory size, and what I'm wondering is if this reflects the actual, physical amount of conventional memory installed, or the amount free when my program was loaded; i.e., can I rely on that value to tell me where I should not write past?
Check out the PSP in AoA 16 for a more detailed explanation.
I think in general, you are safe to stay within the 64KB segment that is assigned to your COM. DOS is a single-tasking OS, and unless you have many TSRs loaded, this should be a workable assumption.
Anything beyond that, and you are going to want to use the DOS INT 0x21 API. For your question in particular, take a look at INT 0x21, Function 0x48 - Allocate Memory; the notes in that link also have a few hints for you.
As for Wikipedia's statement, it is more to the extent that DOS' memory-management facilities are not implicitly available to COM programs.
Hmm. I see in the Int 21/48 documentation that "COM programs are initially allocated the largest available memory block" - that would seem to support Wikipedia's description. Is the COM file itself just loaded into the first <= 64KB of that block, then? If not, how do I find out the location and size of the block?
Before you can allocate anything in a DOS .COM program (and most .EXE programs), you'll have to do a "resize memory block" interrupt call to shrink the program's own block, since that block initially takes up all of the free memory. After that the "malloc" interrupt should work normally. (Wikipedia doesn't know everything. :) )
You can almost(!) certainly use the entire 64k block - but your stack is at the top of it. You can arbitrarily use memory above that... at some small risk of trashing a TSR or so. The "right" way is to "resize memory block" first (keep the whole 64k unless you're desperate for every scrap of memory), and then "malloc" as usual.
(according to my possibly faulty memory)
Best,
Frank
Hmm, okay. Thanks for the information!
Oh, one other thing. If I am going to resize the application block down to something smaller than 64KB, will I have to manually move the stack pointer first?
I would assume that DOS doesn't do anything else beyond the initial load. So yes, assume you need to move/preserve the stack prior to resizing.
After digging out, dusting off and reviewing several good old DOS programming references (see list below), I've compiled a bit more information on this subject...
A .COM program is initially allocated *all* available "transient program area" (TPA) memory, in one contiguous block (typically > 500KB), with the PSP at the bottom and the file image loaded immediately above that. The DS, SS, CS and ES segment registers are all loaded with the same base address: the PSP segment. IP is loaded with 100h, the offset of the first byte of the COM file, which must be an executable instruction. DOS sets the stack pointer to zero (which is effectively 10000h == 64KB) then pushes one word of zero onto the stack, so SP==0FFFEh when the program first runs (if the block is smaller than 64KB, DOS sets the stack pointer to the top of the block minus 2). This zero on top of the stack serves as the return address for the COM program as a whole: if the COM program executes a NEAR RET instruction, the zero is popped off the stack and used as the address at which execution continues. That address is offset zero of the PSP, and the first word of the PSP is a two-byte instruction, "int 20h", which is the old-style method for program termination dating back to MS-DOS version 1.
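To illustrate the RET-based exit described above, here is a minimal sketch of a complete .COM program. It assumes MASM/TASM-style syntax (other assemblers will want different directives), and the `msg` label and its text are made up for the example. It prints a string and then ends with a plain near RET, relying on the zero word that DOS pushed at load time:

```asm
        .model  tiny
        .code
        org     100h                ; COM image starts at offset 100h, right after the PSP
start:
        mov     dx, offset msg      ; DS:DX -> '$'-terminated string (DS = PSP at entry)
        mov     ah, 09h             ; function 09h - Display String
        int     21h
        ret                         ; pops the 0000h pushed by DOS, so execution
                                    ; continues at PSP:0000h, which holds "int 20h"
msg     db      'Exiting via RET', 13, 10, '$'   ; hypothetical message text
        end     start
```

This only works while CS still equals the PSP segment and the stack still holds that original zero word, so a program that relocates its stack should terminate through Int 21h instead.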
An .EXE program is also initially allocated the whole TPA memory block by default. The EXE file has a header which includes two parameters set by the linker (the minimum required and the maximum desired memory), which tell the DOS loader how much extra memory (above and beyond that required by the program's code, data and stack) to allocate to the program at run time. The linker sets this maximum to 0FFFFh by default, so the loader always gives it as much as it can, i.e. the whole TPA. When DOS loads an EXE program, the DS and ES segment registers are set to point to the PSP, and the SS:SP and CS:IP values are obtained from the EXE header structure prepared by the linker. (The runtime CS and SS values are equal to the values in the EXE header added to the image load address, which is the segment address of the first paragraph after the PSP, i.e. PSP segment + 10h.) Whew!
Here is the structure definition of the PSP from the MS-DOS Programmer's Reference:
PSP                 STRUC               ;program segment prefix
PspInt20            dw ?                ;Int 20h instruction
PspNextParagraph    dw ?                ;segment address of next para
                    db ?                ;reserved
PspDispatcher       db 5 DUP (?)        ;long call to MS-DOS
PspTerminateVector  dd ?                ;Termination address (Int 22h)
PspControlCVector   dd ?                ;CTRL+C handler (Int 23h)
PspCritErrorVector  dd ?                ;Critical Err handler (Int 24h)
                    dw 11 DUP (?)       ;reserved
PspEnvironment      dw ?                ;seg addr of environment
                    dw 23 DUP (?)       ;reserved
PspFCB_1            db 16 DUP (?)       ;default FCB #1
PspFCB_2            db 16 DUP (?)       ;default FCB #2
                    dd ?                ;reserved
PspCommandTail      db 128 DUP (?)      ;command tail string (default DTA)
PSP                 ENDS                ;program segment prefix
(Note that this official definition leaves out one important detail: the first byte of PspCommandTail is actually a count of bytes equal to the length of the string which follows.) But the part you are probably most interested in is the PspNextParagraph member, which points to the top of the program's memory block. Here is what the Programmer's Reference has to say about it:
PspNextParagraph
Specifies the segment address of the first paragraph immediately following the program. (This address does not point to the free memory available for the program to use.) Programs use this field to determine quickly whether they were allocated sufficient memory to run successfully
Thus, the memory block containing the program is typically much bigger than 64KB and can be computed by: (PSP.PspNextParagraph - DS) * 16.
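As a rough sketch of that calculation, assuming MASM/TASM-style syntax and a .COM program that has not yet changed DS since entry (so DS still holds the PSP segment):

```asm
; Sketch only: assumes DS = PSP segment, as it is when a .COM program starts.
        mov     ax, ds:[0002h]      ; AX = PspNextParagraph (first segment past our block)
        mov     bx, ds              ; BX = PSP segment (start of our block)
        sub     ax, bx              ; AX = block size in paragraphs
        mov     dx, 16              ; 16 bytes per paragraph
        mul     dx                  ; DX:AX = block size in bytes (32-bit result)
```

The byte count is kept in DX:AX because a block of several hundred KB does not fit in 16 bits; if you only need paragraphs, stop after the SUB.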
DOS actually allocates two memory blocks for each program: one for the program and one for a copy of all the environment variables. The PspEnvironment member points to this other memory block which contains all the ASCIIZ environment variables placed in series and terminated with a zero length string.
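A possible way to walk that environment block, sketched in MASM/TASM-style syntax. The labels `next_var` and `env_done` are made up for the example, and the code assumes DS still holds the PSP segment:

```asm
; Sketch only: assumes DS = PSP segment (true at .COM entry).
        mov     es, word ptr ds:[002Ch] ; ES = PspEnvironment (segment of the environment)
        xor     di, di              ; ES:DI -> first "NAME=value" string
        xor     al, al              ; AL = 0, the byte SCASB looks for
        cld                         ; scan upward through memory
next_var:
        cmp     byte ptr es:[di], 0 ; a zero-length string ends the list
        je      env_done
        ; ES:DI points at one ASCIIZ variable here; inspect it as needed
        mov     cx, 0FFFFh          ; effectively no limit on the scan length
        repne   scasb               ; leaves DI just past this string's terminating 0
        jmp     next_var
env_done:
        ; ES no longer equals the PSP segment; reload it before calls that expect ES = PSP
```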
There is no need to use the memory allocation functions unless you plan on giving some of the memory back. Regarding the DOS Int 21 memory allocation routines, there are three that are useful:
- Function 48h - Allocate Memory. Call params: AH = 48h, BX = number of paragraphs (16 bytes each) to be allocated. Return value: AX = segment address of newly allocated block.
- Function 49h - Free Allocated Memory. Call params: AH = 49h, ES = segment address of block to be freed. Return value: None. (CF clear if successful.)
- Function 4Ah - Set Memory Block Size. Call params: AH = 4Ah, BX = new requested block size in paragraphs, ES = segment of block to be modified. Return value: None. (CF clear if successful.)
In all cases, each of these functions returns with the carry flag clear if it was successful and set if there was an error. If there was an error, then AX contains the error code. If function 48h or 4Ah fails to allocate the requested size, then BX is returned with the largest available size in paragraphs.
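Here is a hedged sketch of how functions 48h and 49h might be used together, including the "retry with the size returned in BX" fallback. It is written in MASM/TASM-style syntax; the labels `alloc_ok` and `mem_error`, the 32KB request size and the exit codes are all invented for the example:

```asm
; Sketch only. Assumes the program has already shrunk its own block with
; function 4Ah; until then there is no free memory for 48h to hand out.
        mov     ah, 48h             ; function 48h - Allocate Memory
        mov     bx, 0800h           ; ask for 800h paragraphs = 32KB (example size)
        int     21h
        jnc     alloc_ok            ; CF clear: AX = segment of the new block
        or      bx, bx              ; CF set: BX = largest block available
        jz      mem_error           ; nothing free at all
        mov     ah, 48h             ; retry with the size DOS reported in BX
        int     21h
        jc      mem_error
alloc_ok:
        mov     es, ax              ; ES = segment of the new block
        ; ... use the block through ES, leaving ES unchanged ...
        mov     ah, 49h             ; function 49h - Free Allocated Memory (block in ES)
        int     21h
        jc      mem_error           ; CF set: AX = error code
        mov     ax, 4C00h           ; terminate with return code 0
        int     21h
mem_error:
        mov     ax, 4C01h           ; terminate with return code 1
        int     21h
```

A real program would usually store the returned segment in a variable rather than parking it in ES, since most DOS calls and string instructions want ES for other things.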
So here is an example snippet (taken from Ray Duncan's book - see below) of a well-behaved .COM program written in ASM which correctly adjusts its stack and reduces its memory block size:
        org     100h
main    proc    far             ;entry point from DOS (ES => PSP)
        mov     sp,OFFSET stk   ;adjust stack pointer
        mov     ah,4ah          ;func 4Ah - Modify Memory Block
        mov     bx,400h         ;400h paras == 16,384 bytes
        int     21h             ;call DOS function 4Ah
        jc      error           ;jump if function failed
        .                       ;our memblk is now 16KB
        .
error:  .                       ;handle error
        .
        dw      256 dup (?)     ;256-word (512-byte) stack
stk     equ     $               ;base of new stack
        .
        .
For low level DOS programming, I would highly recommend the following two Microsoft Press books (used versions are available for mere pennies + shipping):
MS-DOS Programmer's Reference
Advanced MS-DOS Programming By Ray Duncan
Okay. That's kind of what I had assumed, but it's good to get all the details. Thanks for the help!
commodorejohn,
There is also a word at PSP:6 that contains the number of bytes available in the segment for a .COM program. It usually contains a value around 0xFF00, but when the .COM is loaded into the UMA (upper memory area), it depends on the actual memory block size.