So I'm working on a DOS project (writing a game engine in 8086 assembler, to be specific). I'd like to build it as a COM file, mostly to save myself the hassle of linking individual bits together. If Wikipedia is to be believed, DOS's memory-management facilities are not available to COM programs; it states that "all memory is available to the application." This suits me fine, as I wanted a couple of facets of memory management to line up more neatly with the rest of the engine, and it wasn't all that hard to hack up replacements for the routines I need. The thing is, I'm also trying to write the engine to play nicely with its operating environment: I want it to make a clean exit and leave the computer in exactly the state it found it, and I want to be sure I know how to do my own memory management without clobbering any drivers or other TSR software loaded into conventional memory.

I'm currently assuming that the program will be loaded fairly low in memory, so I don't need to bother reading or writing below the start of the program segment. That means I only need to worry about anything loaded just under the top boundary of the conventional memory area (it's my understanding that this is where DOS and drivers tend to go). I see an entry in the Program Segment Prefix for memory size, and what I'm wondering is whether this reflects the actual, physical amount of conventional memory installed or the amount free when my program was loaded; i.e., can I rely on that value to tell me where I must not write past?
Posted on 2010-08-14 14:54:02 by commodorejohn
Check out the PSP coverage in AoA 16 (the 16-bit edition of The Art of Assembly Language) for a more detailed explanation.

I think in general, you are safe to stay within the 64KB segment that is assigned to your COM program. DOS is a single-tasking OS, and unless you have many TSRs loaded, this should be a workable assumption.

For anything beyond that, you are going to want to use the DOS INT 0x21 API. For your question in particular, take a look at INT 0x21, Function 0x48 - Allocate Memory; the notes in that link also have a few hints for you.

As for Wikipedia's statement, it would be more accurate to say that DOS's memory-management facilities are not implicitly available to COM programs.
Posted on 2010-08-14 15:24:47 by SpooK
Hmm. I see in the Int 21/48 documentation that "COM programs are initially allocated the largest available memory block" - that would seem to support Wikipedia's description. Is the COM file itself just loaded into the first <= 64KB of that block, then? If not, how do I find out the location and size of the block?
Posted on 2010-08-14 16:48:32 by commodorejohn
Before you can allocate anything in a DOS .COM program (and most .EXE programs), you'll have to do a "resize memory block" interrupt. After that the "malloc" interrupt should work normally. (Wikipedia doesn't know everything. :) )

You can almost(!) certainly use the entire 64k block - but your stack is at the top of it. You can arbitrarily use memory above that... at some small risk of trashing a TSR or so. The "right" way is to "resize memory block" first (keep the whole 64k unless you're desperate for every scrap of memory), and then "malloc" as usual.

(according to my possibly faulty memory)

Best,
Frank

Posted on 2010-08-14 19:22:37 by fbkotler
Hmm, okay. Thanks for the information!
Posted on 2010-08-14 19:44:27 by commodorejohn
Oh, one other thing. If I am going to resize the application block down to something smaller than 64KB, will I have to manually move the stack pointer first?
Posted on 2010-08-16 15:07:52 by commodorejohn

I would assume that DOS doesn't do anything else beyond the initial load. So yes, assume you need to move/preserve the stack prior to resizing.
Posted on 2010-08-16 18:46:13 by SpooK
After digging out, dusting off and reviewing several good old DOS programming references (see list below), I've compiled a bit more information on this subject...

A .COM program is initially allocated *all* available "transient program area" (TPA) memory in one contiguous block (typically > 500KB), with the PSP at the bottom and the file image loaded immediately above it. The CS, DS, ES and SS segment registers are all loaded with the same segment address, that of the PSP. IP is loaded with 100h, the offset of the first byte of the COM file, which must therefore be an executable instruction. DOS sets the stack pointer to zero (effectively 10000h == 64KB) and then pushes one word of zero onto the stack, so SP == 0FFFEh when the program first runs (if the block is smaller than 64KB, DOS sets the stack pointer to the top of the block minus 2). This zero on top of the stack serves as the return address for the COM program as a whole: if the program executes a NEAR RET instruction, the zero is popped off the stack and execution continues at offset zero of the segment, which is where the PSP begins. The first word of the PSP is the two-byte instruction "int 20h", the old-style method of program termination dating back to MS-DOS version 1.
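To make the NEAR RET exit path concrete, here is a minimal sketch of a complete .COM program that relies on it. It is my own illustration, not taken from the references: it assumes MASM-style syntax (assembled as a tiny-model image), and the segment and label names are made up.

```asm
; Minimal .COM sketch. Assumes MASM-style syntax; "code", "start" and
; "msg" are illustrative names, not anything DOS requires.
code    segment
        assume  cs:code, ds:code, es:code, ss:code
        org     100h                ;the .COM image is loaded at offset 100h
start:  mov     dx,offset msg       ;DS:DX -> '$'-terminated string
        mov     ah,09h              ;func 09h - Display String
        int     21h
        ret                         ;near return pops the zero word DOS pushed,
                                    ;so execution continues at CS:0000, which
                                    ;holds the "int 20h" at the start of the PSP
msg     db      'Hello from a COM file',13,10,'$'
code    ends
        end     start
```

This only works while SP still points at the word DOS pushed. Once a program relocates its stack, it should terminate with Int 21h function 4Ch (or Int 20h) instead.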

An .EXE program is also allocated the whole TPA memory block by default. The EXE file has a header that includes two parameters set by the linker (the minimum required and the maximum desired memory), which tell the DOS loader how much extra memory, above and beyond that required by the program's code, data and stack, to allocate to the program at run time. The linker sets this maximum to 0FFFFh by default, so the loader always gives the program as much as it can, i.e. the whole TPA. When DOS loads an EXE program, the DS and ES segment registers are set to point to the PSP, and the SS:SP and CS:IP values are obtained from the EXE header prepared by the linker. (The runtime CS and SS values are the values in the EXE header added to the image load address, which is the segment address immediately following the PSP, i.e. PSP + 10h.) Whew!

Here is the structure definition of the PSP from the MS-DOS Programmer's Reference:

PSP                     STRUC           ;program segment prefix
PspInt20                dw      ?               ;Int 20h instruction
PspNextParagraph        dw      ?               ;segment address of next para
                        db      ?               ;reserved
PspDispatcher           db      5 DUP (?)       ;long call to MS-DOS
PspTerminateVector      dd      ?               ;Termination address (Int 22h)
PspControlCVector       dd      ?               ;CTRL+C handler (Int 23h)
PspCritErrorVector      dd      ?               ;Critical Err handler (Int 24h)
                        dw      11 DUP (?)      ;reserved
PspEnvironment          dw      ?               ;seg addr of environment
                        dw      23 DUP (?)      ;reserved
PspFCB_1                db      16 DUP (?)      ;default FCB #1
PspFCB_2                db      16 DUP (?)      ;default FCB #2
                        dd      ?               ;reserved
PspCommandTail          db      128 DUP (?)     ;command tail string (default DTA)
PSP                     ENDS            ;program segment prefix


(Note that this official definition leaves out one important detail: the first byte of PspCommandTail is actually a count of the bytes in the string that follows.) But the part you are probably most interested in is the PspNextParagraph member, which points to the top of the program's memory block. Here is what the Programmer's Reference has to say about it:

PspNextParagraph
Specifies the segment address of the first paragraph immediately following the program. (This address does not point to the free memory available for the program to use.) Programs use this field to determine quickly whether they were allocated sufficient memory to run successfully


Thus the memory block containing the program is typically much bigger than 64KB, and its size in bytes can be computed as: (PSP.PspNextParagraph - DS) * 16, where DS still holds the PSP segment it had at program entry.
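As a rough sketch of that calculation (my own fragment, not from the references; it assumes MASM-style syntax and that DS has not been changed since program entry):

```asm
; Fragment for the start of a .COM program, while DS still = PSP segment.
        mov     ax,ds:[2]           ;AX = PspNextParagraph (word at PSP:0002)
        mov     bx,ds               ;BX = PSP segment
        sub     ax,bx               ;AX = block size in paragraphs
        mov     dx,16               ;16 bytes per paragraph
        mul     dx                  ;DX:AX = block size in bytes
```

The byte count needs the 32-bit DX:AX result because the block is usually larger than 64KB. If you only need to compare sizes, it is simpler to stay in paragraphs and skip the multiply.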

DOS actually allocates two memory blocks for each program: one for the program itself and one for a copy of the environment variables. The PspEnvironment member holds the segment address of this second block, which contains the environment variables as a series of ASCIIZ strings terminated by a zero-length string.
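To illustrate the layout of the environment block, here is a sketch of a loop that walks it and counts the variables. Again this is my own fragment under stated assumptions: MASM-style syntax, DS still equal to the PSP segment, and label names chosen only for illustration.

```asm
; Fragment: count the strings in the environment block.
; Note that it changes ES, so reload ES with the PSP segment before
; calling any function (such as 4Ah) that expects ES = PSP.
        mov     es,ds:[2Ch]         ;ES = PspEnvironment (word at PSP:002Ch)
        xor     di,di               ;ES:DI -> first string
        xor     bx,bx               ;BX = number of variables found so far
        xor     al,al               ;AL = 0, the byte SCASB looks for
        cld                         ;scan forward
envnext:
        cmp     byte ptr es:[di],0  ;zero-length string marks the end
        je      envdone
        mov     cx,0FFFFh           ;no practical limit on string length
        repne   scasb               ;advance DI past this string's terminator
        inc     bx                  ;one more variable
        jmp     envnext
envdone:                            ;BX = count, ES:DI -> final zero byte
```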

There is no need to use the memory allocation functions unless you plan on giving some of the memory back. Regarding the DOS Int 21 memory allocation routines, there are three that are useful:

  • Function 48h - Allocate Memory Call params: AH = 48h, BX = paragraphs (16 bytes each) to be allocated. Return value: AX = segment address of newly allocated block.

  • Function 49h - Free Allocated Memory Call params: AH = 49h, ES = segment address of block to be freed. Return value: None. (CF clear if successful.)

  • Function 4Ah - Set Memory Block Size Call params: AH = 4Ah, BX = new requested block size in paragraphs, ES = segment of block to be modified. Return value: None. (CF clear if successful.)



In all cases, these functions return with the carry flag clear on success and set on error. On error, AX contains the error code. If function 48h or 4Ah fails because the requested size is not available, BX is returned with the largest available size in paragraphs.
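Putting the three functions together, a .COM program that wants to use them might look roughly like the sketch below. This is my own illustration rather than code from the references: it assumes MASM-style syntax, the label names and the 100h-paragraph request are arbitrary, and real code would do something more useful than exit on failure.

```asm
; Sketch: shrink the program's own block, then allocate and free a block.
code    segment
        assume  cs:code, ds:code, es:code, ss:code
        org     100h
start:  mov     sp,offset stktop    ;move the stack inside the part we keep
        mov     bx,offset stktop    ;bytes to keep, counted from PSP:0000
        add     bx,15               ;round up to a whole paragraph
        mov     cl,4
        shr     bx,cl               ;BX = paragraphs to keep
        mov     ah,4Ah              ;func 4Ah - resize block (ES = PSP at entry)
        int     21h
        jc      fail                ;CF set: AX = error code
        mov     ah,48h              ;func 48h - Allocate Memory
        mov     bx,100h             ;100h paras == 4,096 bytes
        int     21h
        jc      fail                ;CF set: BX = largest block available
        mov     es,ax               ;ES:0000 -> start of the new block
                                    ;(use the block here)
        mov     ah,49h              ;func 49h - free the block at ES
        int     21h
        jc      fail
        mov     ax,4C00h            ;terminate, return code 0
        int     21h
fail:   mov     ax,4C01h            ;terminate, return code 1
        int     21h
        even                        ;keep the stack word-aligned
        dw      128 dup (?)         ;128-word (256-byte) stack
stktop  equ     $                   ;initial top of stack, end of kept memory
code    ends
        end     start
```

Because the stack has been moved, this version exits through function 4Ch rather than a NEAR RET.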

So here is an example snippet (taken from Ray Duncan's book - see below) of a well-behaved .COM program written in ASM which correctly adjusts its stack and reduces its memory block size:

        org     100h
main    proc    far             ;entry point from DOS (ES => PSP)
        mov     sp,OFFSET stk   ;adjust stack pointer
        mov     ah,4ah          ;func 4Ah - Modify Memory Block
        mov     bx,400h         ;400h paras == 16,384 bytes
        int     21h             ;call DOS function 4Ah
        jc      error           ;jump if function failed
        .                       ;our memblk is now 16KB
        .
error:  .                       ;handle error
        .
        dw      256 dup (?)     ;256-word (512-byte) stack
stk     equ     $               ;base of new stack
        .
        .




For low level DOS programming, I would highly recommend the following two Microsoft Press books (used versions are available for mere pennies + shipping):
MS-DOS Programmer's Reference
Advanced MS-DOS Programming By Ray Duncan
Posted on 2010-10-12 16:16:35 by ridgerunner
Okay. That's kind of what I had assumed, but it's good to get all the details. Thanks for the help!
Posted on 2010-10-12 16:22:39 by commodorejohn
commodorejohn,

There is also a word at PSP:6 that contains the number of bytes available in the segment for a .COM program. It is usually around 0xFF00, but when the .COM is loaded in the UMA, it depends on the actual memory block size.
Posted on 2010-10-13 04:42:49 by baldr