Could you explain this statement (from the Protected Mode Memory section of asmintro.chm in masm32) and give some code examples? :D "You can also get into trouble if you incorrectly dereference a variable in a register and try and read or write to that address as it will be out of the address range that your program has access to."
I don't know what "a variable in a register" is either. Could you explain this phrase to me?

Here's an online copy for your quick view of this document:
Masm32:Flat memory Model
http://www.feiesoft.com/masm32/asmintro/flatmemorymodel.html
Posted on 2011-10-11 12:13:36 by bolzano_1989
Programs in userland use virtual memory addresses, not the real absolute ones that protected mode uses; you will need to figure out how to convert between them.
To make this more clear: most applications appear to execute at the same memory address (400000h or whatever), and most DLLs similarly appear to be mapped to a base address of 10000000h or something like that. Obviously, this can't really be the case, as one application would overwrite another. These addresses are not real; they are a fiction of the operating system. When a file is 'mapped' into memory, it is allocated a virtual address range relative to the process that opened it, and similarly, when a process is created, it is allocated some virtual 'mapped' memory by the system. This implies that the system's memory page manager contains a big fat list of these associations, a map of real to virtual address associations, heh.

Oh, and "a variable in a register"? It just means you used a register to hold some variable (say a pointer) and then tried to access the pointed-at memory via the register you loaded.
eg
mov eax,00400000h
mov edx,[eax]    ; <- ouch in pmode

Posted on 2011-10-12 04:21:01 by Homer
Thank you Homer, now I know that [eax] in "mov edx, [eax]" is called "a variable in a register" and such usage is called "dereference a variable in a register" :D.
Posted on 2011-10-12 12:45:57 by bolzano_1989
Disclaimer: I don't "do Windows" and am a "devout Nasmist", so I'm not familiar with Masm32!

Quoting Homer:

mov eax,00400000h
mov edx,[eax]    ; <- ouch in pmode

Really? What's wrong with that? I would expect it to return "MZ" and a couple other bytes. No? In Linux, I'd have to use a different address - 8048000h - but I'd get back 7Fh, 'E', 'L', 'F' in edx (little-endian, of course)...

The page bolzano_1989 references says all segment registers are "set to the same value". I don't think this is, strictly speaking, correct. Is cs = ds in Windows? It isn't in Linux - cs=23h, ds=2Bh (according to my debugger). It is true that the "base" of the segment descriptor pointed to by both selectors is zero, so it amounts to the same thing, but "strictly speaking"(!), they are not "the same value". Also, I think fs is "thread local storage", but that's a different issue...

To illustrate what I understand by "dereference a variable", I offer a stupid mistake I made trying to help "anders11" in a nearby topic. I had a "count" variable...

count: .long 0

in (G)as syntax. Nasm would be:

count dd 0

(I think Masm would be the same... or maybe "dword" rather than "dd"). I had filled this variable with, say, 5. Now I call a subroutine, something like:

pushl $count
call mything

In Nasm syntax:

push count
call mything

In Masm(?):

push offset count
call mything

In my subroutine, I did something like:

; prolog
movl 8(%ebp), %ecx

or Nasm:

; prolog
mov ecx, [ebp + 8]

Masm(?)

; prolog
mov ecx, 8[ebp]

(I think the Nasm syntax would also work in Masm?)
Then I tried to use ecx as a loop counter. RONG! I had the address of "count" in ecx, not the "[contents]". Loops a few too many times! What I had to do was "dereference the variable":
Gas:

movl (%ecx), %ecx

Nasm/Masm:

mov ecx, [ecx]

(yes, it's okay to use the same register)

Of course, it would have been smarter to pass the contents, rather than the address in the first place!
Gas:

pushl count

Nasm:

push dword [count]

Masm

push dword ptr count

?

This is, as I understand it, the difference between "passing by reference" and "passing by value". I'm not certain I've got the terminology right, but I'm pretty sure I've got the concept right. The distinction between "address" (offset) and "contents" of that address is an important one for beginners to "get" (and anyone to remember :) ). Unfortunately, this is an area where different assemblers differ greatly in syntax!

Please correct my Masm syntax, and the "concept", too, if you think I've got it wrong!

Best,
Frank

Posted on 2011-10-12 19:25:57 by fbkotler
You just need to look up 'physical to virtual addressing' to see what I meant. Protected mode uses segment selectors to map 'linear addresses' to 'physical addresses', but this is only the beginning of the story; it's just how kernel drivers view the world (under Windows). Userland applications live behind a mystical curtain called ring 3, and use another layer of memory address virtualization where physical addresses are mapped to virtual (process space) addresses. The examples I gave were virtual addresses, so it's anyone's guess what you would find at physical address 00400000h, let alone at the same address in linear memory!
Posted on 2011-10-13 02:43:56 by Homer
Ah, you meant 400000h physical. No, we couldn't see that from "userland". I'm used to (resigned to) thinking of virtual addresses as "all there is". Who knows what physical address that might be? Different for each process, as you point out. Might even be swapped out to disk, and not be in RAM at all! Fortunately(?), the OS takes care of it for us, and we don't need to know/care... unless we're writing an OS, in which case we would need to know all the ugly(?) details.

To further complicate the issue, some memory is "readonly". In Linux:

mov eax, 8048000h
mov edx, [eax]  ; gets 7F 'E' 'L' 'F'
mov [eax], edx  ; BZZZZT! - readonly!


8048000h is the "beginning of the world" - an attempt to read/write at a lower address than that would fail. Our "data segment" starts at 8049xxxh (or higher, depending on the size of the code segment) - there we can write. Above that, there's the "break" - we can't read/write above that... although we can alter it to allocate more memory. The stack is higher than that - 0C0000000h, and working down. The numbers would be different in Windows, but I think the idea's the same. If we use "named variables", we don't need to know what the numbers are, so it probably isn't useful to discuss. Sorry.

Best,
Frank

Posted on 2011-10-13 11:10:10 by fbkotler