Heya everybody... nice to see you... :)

so... it's been a while now that i have been coding ASM but i still can't get a clean idea of how to work... so...

do 16-bit and 32-bit ASM have any differences in their code besides the extended registers? for example... in 16-bit you have to write "main PROC.... END PROC" but in 32-bit you don't have to... right? anything else?

there are many assemblers... which one to choose? every assembler has its own code structure, and these structures are different on Windows and Unix... right?

can you also suggest any good papers, tutorials that can help me get a good idea of the game?

thank you in advance fellas...
cee ya
Posted on 2009-07-29 17:17:55 by stakisko
Other than the extended registers, the primary differences in 32-bit assembly are that you will be using a different memory model and that, instead of directly accessing system routines through the BIOS and operating system IDT, you will be making use of the operating system's shared APIs.

For 32-bit Windows applications, the memory model you will need to become accustomed to is the FLAT memory model. The FLAT memory model provides you with linear addressing, which gives you, the programmer, access to all memory within the process's scope. The operating system allocates your process a linear address space into which all data, code, etc. are placed. When addressing data you use either an address relative to your current location or an absolute address relative to 0 (the beginning of your process's address space). This makes things a lot easier because there is no need to deal with segmentation.

32-bit operating systems make use of ringX levels, i.e. ring0 to ring3. Your process will be running in ring3 or "user mode", which means you don't have access to the underlying IDT anymore; everything you do needs to go through the operating system. Because of this, 32-bit operating systems export procedures which your ring3 process has access to. The "user mode" application calls these procedures, and the procedures call system service routines which execute in kernel mode (ring0) before returning the results back down the call stack to your application. Some APIs perform simple routines which don't require any system services at all but simplify such things as converting ASCIIZ text to UNICODE wide format.

When you begin programming under a 32-bit operating system it's a pretty good idea to get a copy of that operating system's API documentation for quick reference to these procedures, as you will be using them A LOT. Windows 32-bit operating systems make use of the Win32 API, which is documented at msdn.microsoft.com; a download link for the HTML Help version is also at the bottom of this reply. The GNU/Linux and BSD systems make use of the Standard C Library. These *nix systems are unusual in that they DO leave a software interrupt open to user mode, allowing you to perform system calls directly through `INT 80H', where EAX = syscall function number. Whatever your operating system, you'll find it's best to try and conform to the shared objects supplied to you, at least until you find a good reason to be accessing the low level interrupts.

Here are two examples to help you get a visual of 32-bit programming. I'm not writing a tutorial here so it's on you to research and figure out what each API routine does.

.386
.MODEL FLAT, STDCALL
OPTION CASEMAP: NONE

; --------------------------------------------------
; Filename: WinAsmDemo.asm
; Developer: Bryant Keller
; Date: July 29, 2009
; Purpose: This is a JWASM example of writing
; 32-bit assembly on a Windows OS.
; Build: jwasm -coff WinAsmDemo.asm
; golink WinAsmDemo.obj
; --------------------------------------------------

MessageBoxA PROTO :DWORD, :DWORD, :DWORD, :DWORD
ExitProcess PROTO :DWORD

NULL Equ 0
MB_OK Equ 0
MB_ICONINFORMATION Equ 40h

.DATA
strTitle BYTE 'Win32ASM Demo', 0
strMessage BYTE 'This is a simple example of programming '
BYTE 'on Windows in JWASM', 0

.CODE
MsgBox PROC lpstrMessage:DWORD, lpstrTitle:DWORD
    ; STDCALL arguments are pushed right to left
    PUSH MB_OK + MB_ICONINFORMATION ; uType
    PUSH lpstrTitle                 ; lpCaption
    PUSH lpstrMessage               ; lpText
    PUSH NULL                       ; hWnd (no owner window)
    CALL MessageBoxA
    XOR EAX, EAX                    ; return 0
    RET
MsgBox ENDP

Start PROC
    INVOKE MsgBox, Addr strMessage, Addr strTitle
    INVOKE ExitProcess, 0           ; does not return
Start ENDP

END Start


.386
.MODEL FLAT, SYSCALL
OPTION CASEMAP: NONE

; --------------------------------------------------
; Filename: LinAsmDemo.asm
; Developer: Bryant Keller
; Date: July 29, 2009
; Purpose: This is a JWASM example of writing
; 32-bit assembly on a *NIX OS.
; Build: jwasm -elf LinAsmDemo.asm
; gcc -s -nostartfiles -o LinAsmDemo LinAsmDemo.o
; --------------------------------------------------

puts PROTO :DWORD
exit PROTO :DWORD

.DATA
strMessage BYTE 'This is a simple example of programming '
BYTE 'on GNU/Linux in JWASM', 0

.CODE
_start PROC
    INVOKE puts, Addr strMessage
    INVOKE exit, 0
_start ENDP

END _start


Related Links:

  • Win32 API Help Files - http://www.carabez.com/downloads/win32api_big.zip

  • GNU/Linux Online Manual Pages (full api list included) - http://www.die.net

  • GNU C Library Reference Manual - http://www.gnu.org/s/libc/manual/html_node/index.html

  • Intel Manuals (explains the Intel 64 and IA-32 architectures) - http://www.intel.com/products/processor/manuals

  • Microsoft Developer's Network - http://msdn.microsoft.com/en-us/default.aspx



I hope this helps you get started. Also, about your mention of assemblers: just try a few out and pick the one you like. Despite what fanatics will tell you, there is no "best" assembler. Each one has its own good points and bad points. I like the MASM/POASM/JWASM style assemblers for quickly whipping up demo code and testing out routines where I really don't care about optimizing the whole application and I'm willing to let the assembler handle all the stuff that isn't important at that time. I like assemblers like NASM/GOASM for creating production code because they don't assume too much about your code and perform a lot of 1 to 1 translation. With me, it's more about what's supported where. When I'm on Windows I tend to use POASM and GoASM. PoASM supports the HL syntax I like for prototyping code, and GoASM, although the macro engine isn't the greatest, has a lot of shortcuts that NASM just doesn't have (like D, W, B instead of DWORD, WORD, and BYTE, plus anonymous labels). On GNU/Linux I use JWASM for prototyping as it's the only one of the "HL Assemblers" which supports Linux and doesn't demolish the language. For any production code on Linux/BSD (or anything else that supports IA16/32/64, other than Windows) I use NASM due to its powerful macro engine while still supporting 1 to 1 translation.

You'll see a lot of "Assembler Wars" start up every now and then, but to be honest it's all a matter of preference, and it doesn't really matter what you use as long as you are comfortable using it. If you like the HL style assemblers, by all means use them. It's better to write software using a tool you are comfortable with than to try and be "cool" and use one which confuses you. Truth is, unless you are comfortable using the 1 to 1 style assemblers you aren't going to be able to write better optimizations than the HL ones can generate anyway, so what's the purpose of using one? But that's enough rambling out of me. :lol:

Regards,
Bryant Keller
Posted on 2009-07-29 23:21:46 by Synfire
can you also suggest any good papers, tutorials that can help me get a good idea of the game?

First you should try to comprehend the examples provided by Synfire. As soon as you understand what is going on in these you'll start to get a hang of it.
Posted on 2009-07-30 00:06:01 by ti_mo_n
thanks a lot guys... bryant you were really helpful...

so... i personally like coding with nasm under both Windows and *nix systems.

i will start right now with your links... also a win32 API documentation i have already read was www.winprog.org/tutorial. it's using C++ to access the API but it's a good start...

i hope this will be a good start for me...

thanks again everyone... cee ya around fellas
Posted on 2009-07-30 04:40:26 by stakisko
What a great answer you gave, Synfire  ;)
Posted on 2009-07-30 11:54:14 by ChaperonNoir
stakisko,

Make sure you get the Win32 API help files I suggested. Just about everyone on the board probably already has these and I know when I first moved over from GNU/Linux to Win32 they practically saved my life.  :lol:

The API references I posted use C instead of C++, which is a good thing. I haven't checked out your link yet, but if it uses C++ there is the possibility of it listing MFC or ATL code, which isn't part of the API per se; it's part of the PSDK. A lot of C++ documents also reference wrappers around APIs instead of the APIs themselves, and it is always a "joy" to spend hours trying to find out which DLL some procedure you referenced is in, only to find out that it's a C++ wrapper for a procedure with a similar name but more arguments. :mad:

For NASM, although I'm a bit partial to this, I would suggest you try out NASMX. The INVOKE macro included in it supports on-call importing of API routines which means you don't have to worry with hunting down what procedure is in what DLL.

Also, if you decide not to use NASMX, you'll need to do your own name mangling for imported procedures. As I'm sure you know, NASM has two directives which can be used for imported routines: IMPORT and EXTERN. Depending on which one you use, the naming convention will be different. Calling convention also plays a part in name mangling... I really suggest you use NASMX or at least check out the macros included with it. NASM does nothing to help us with name mangling since it's a cross-platform assembler, and if you aren't familiar with the conventions it gets confusing sometimes (it's also a bit of a pain to have to write __imp__MessageBoxA@16 each time you do a call).
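To make the mangling a little more concrete, here is a rough Python sketch of how the 32-bit decorated names are usually built. This is only an illustration of the Microsoft-style x86 conventions as I understand them (leading underscore, and for STDCALL an `@` followed by the number of argument bytes); the helper names are made up for this post and are not part of NASM or NASMX.

```python
def cdecl_name(name: str) -> str:
    """C calling convention on 32-bit Windows: just a leading underscore."""
    return "_" + name


def stdcall_name(name: str, arg_bytes: int) -> str:
    """STDCALL: leading underscore plus '@' and the total argument size in bytes."""
    if arg_bytes < 0 or arg_bytes % 4 != 0:
        raise ValueError("32-bit stack arguments occupy a multiple of 4 bytes")
    return f"_{name}@{arg_bytes}"


def import_pointer_name(name: str, arg_bytes: int) -> str:
    """Symbol for the import table slot: '__imp_' in front of the decorated name."""
    return "__imp_" + stdcall_name(name, arg_bytes)


# MessageBoxA takes four DWORD arguments (16 bytes), ExitProcess takes one (4 bytes).
print(stdcall_name("MessageBoxA", 16))
print(stdcall_name("ExitProcess", 4))
print(import_pointer_name("MessageBoxA", 16))
```

The argument byte counts match the PROTO lines in the JWASM demo earlier in the thread: four :DWORD parameters for MessageBoxA and one for ExitProcess.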

I personally suggest you use the EXTERN directive over the IMPORT directive. IMPORT seems like the better choice as it was explicitly designed for importing APIs, but honestly it's not. I remember back when I did the early releases of NASM32 I had the backend importing routines through IMPORT, and all my test builds were done with the `-fobj' option due to the use of ALINK as my linker. Unfortunately, after release a bug reared its ugly little head when people started trying to link their object code using MS-LINK and Jeremy's GoLINK. It turned out that IMPORT was, at that time at least, designed with OMF/OBJ in mind, and when you tried to build COFF objects (the `-fwin32' option) for use with other linkers the whole system would fail because the imports would never be found. Since then I changed everything over to EXTERN and it started working with OMF/OBJ and MS/COFF without problem. It was a real big issue in the early releases that nearly killed off the project, but from the sound of it you are kinda like me and do cross-platform code a lot, so I would definitely suggest you stick with EXTERN.

I didn't even notice you asked for tutorials/documents... right off the top of my head I can't think of anything. Give me some time: I've currently got an essay to finish for my Humanities class and I have to place an order for a NIC and two printers before I leave for work in the morning. After that I'll look through my netbook's bookmarks for any NASM related links I can find.

~Bryant
Posted on 2009-07-31 02:01:58 by Synfire
Hi everyone, I'm new here.
"PROC.... END PROC" I believe these are assembler directives and
are used in both 16-bit and 32-bit assembly, am I right?
16-bit and 32-bit instructions differ in their instruction format.
Registers are 32-bit wide in 32-bit assembly, the segment registers FS and GS are new
but accessible in real mode, but in real mode FS and GS cannot be used
to index into a segment.
Real mode is the mode of the processor right after reset, in this mode
the processor is like an 8086.
In 16-bit assembly physical addresses are formed by bitshifting
the value held in a segment register to the left by four bits
to form a 20-bit physical address corresponding to twenty addressing pins,
limiting the amount of physical address space to 1MB, excluding
the possibility of the A20 line (21st addressing pin found in 80286+).
Since offsets are 16-bits in 16-bit assembly, segments are 64KB in length (max).
In 32-bit assembly segment registers hold selectors which act as an index into
a table of descriptors (in memory), these descriptors hold the linear address
for segments,
segmentation unit->linear address->paging unit->physical address
if I'm right, I'm still learning 32-bit assembly.
Offsets are 32-bits in 32-bit assembly hence segments are 4GB in length (max).
When a flat memory model is chosen one segment is used to map
the entire linear address space up to 4GB, segment registers still contain
a selector.
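To check my understanding of selectors I wrote a small Python sketch. It assumes the usual protected mode layout of a 16-bit selector (bits 15..3 = descriptor index, bit 2 = table indicator, bits 1..0 = requested privilege level) and 8-byte descriptors; the function name and the sample values are just ones I picked for illustration.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit protected mode segment selector into its fields."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    index = selector >> 3                  # bits 15..3: index into the descriptor table
    return {
        "index": index,
        "table": "LDT" if selector & 0b100 else "GDT",  # bit 2: table indicator
        "rpl": selector & 0b11,            # bits 1..0: requested privilege level
        "descriptor_offset": index * 8,    # each descriptor is 8 bytes long
    }


# Two arbitrary sample selectors, both with RPL 3 (user mode).
print(decode_selector(0x001B))
print(decode_selector(0x0023))
```

So even in a flat model the segment register does not hold an address; it only says which descriptor to use and at what privilege level.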
---------
Nice to see an assembly site with a lot of activity, I mainly toy with 8086/8088 assembler.
Posted on 2009-07-31 06:22:03 by 3y3ty
stakisko, here are the links I was able to dig up...

Paul Carter's Tutorial - http://www.drpaulcarter.com/pcasm
LinuxAssembly - http://www.linuxassembly.org
DeinMeister's Win32Asm Tutorial - http://www.deinmeister.de/wasmtute.htm
A Super Simple NASM Tutorial - http://www.cin.ufpe.br/~if817/arquivos/asmtut/index.html
AMD Developer's Guides - http://developer.amd.com/documentation/guides/Pages/default.aspx
Flat Assembler Documents - http://flatassembler.net/docs.php
Agner's Optimization Reference - http://www.agner.org/optimize
WizKid's BSD/NASM Tutorial - http://www.int80h.org
MIT's x86 Files - ftp://rtfm.mit.edu/pub/usenet/news.answers/assembly-language/x86/
Microsoft Macro Assembler Reference - http://msdn.microsoft.com/en-us/library/afzk3475(VS.71).aspx
SmallCode Tutorials - http://www.strchr.com
Jeremy Gordon's Tutorials - http://www.jorgon.freeserve.co.uk/#tutorials

I'll look around for more later on man, but this should get you started.


Hi everyone, I'm new here.


Welcome to the forum 3y3ty :)


"PROC.... END PROC" I believe these are assembler directives and
are used in both 16-bit and 32-bit assembly, am I right?


Depending on the assembler you use, correct. Many assemblers don't support PROC/ENDP natively, that is; many developers take it upon themselves to create PROC/ENDP macros which simulate the built-in directives of other assemblers to ease programming.


16-bit and 32-bit instructions differ in their instruction format.
Registers are 32-bit wide in 32-bit assembly, the segment registers FS and GS are new
but accessible in real mode, but in real mode FS and GS cannot be used
to index into a segment.


Actually FS and GS have been supported in 16-bit code since the 80386, so they aren't very new and they are certainly not restricted to 32-bit assembly. His earlier post suggested that he already understood the registers, and I neglected to go into any great detail on segment registers as they aren't used as much in 32-bit systems. The only one I can think of at the moment is the FS register, for setting up exception handling and accessing those oh-so-wonderful undocumented Windows structures.


In 16-bit assembly physical addresses are formed by bitshifting
the value held in a segment register to the left by four bits
to form a 20-bit physical address corresponding to twenty addressing pins,
limiting the amount of physical address space to 1MB, excluding
the possibility of the A20 line (21st addressing pin found in 80286+).


You're half right: you shift the segment register left by 4 bits, then you also have to add the offset of the memory you are addressing. Of course nobody really needs to deal with this, since in 32-bit mode you are almost always running in the flat memory model, which uses a contiguous chunk rather than segmented areas of memory.


Nice to see an assembly site with a lot of activity, I mainly toy with 8086/8088 assembler.


And it's nice to have some new blood here. Your 8086/8088 roots are definitely showing, as it's rare to see any discussion of segmented addressing or really any 16-bit stuff at all. I can't speak for everyone, but I think many people moved on from 16-bit programming a long time ago and the memories of coding TSRs and fighting over upper memory are just too much to bear. :lol:

~Bryant
Posted on 2009-07-31 17:34:34 by Synfire
Don't worry, I understand that a 16-bit offset
is added to the resulting 20-bit physical address
under the 8086.
How else would you index into a segment?
Hence if SegReg=1211h with an offset
of 0011h would form a physical address of:
12110h
+0011h
-------
12121h physical address.
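
The same arithmetic as a small Python sketch, in case anyone wants to play with other values. The function name is my own, and the wrap at 1MB is my assumption of how an 8086 (or a later CPU with the A20 line masked) behaves.

```python
def real_mode_physical(segment: int, offset: int, a20_enabled: bool = False) -> int:
    """Combine a 16-bit segment and a 16-bit offset the way real mode does."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    address = (segment << 4) + offset   # shift left by four bits, then add the offset
    if not a20_enabled:
        address &= 0xFFFFF              # only 20 address lines, so wrap at 1MB
    return address


print(hex(real_mode_physical(0x1211, 0x0011)))        # the example above
print(hex(real_mode_physical(0x1212, 0x0001)))        # a different pair, same byte
print(hex(real_mode_physical(0xFFFF, 0x0010)))        # wraps around to 0
print(hex(real_mode_physical(0xFFFF, 0x0010, True)))  # with A20: just past 1MB
```

Note how different segment:offset pairs can name the same physical byte, since segments overlap every 16 bytes.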

But I would like to get into the protected mode stuff.
I want to use win32 assembly to get familiar with
32-bit programming,
and study the 80386 programmer's reference manual
to get familiar with the systems programming part.

I'm sure this is the site to get help on this.

Does anyone know where I can get a descriptive
introduction to protected mode descriptor format,
memory protection and paging???
Posted on 2009-07-31 18:47:37 by 3y3ty

Does anyone know where I can get a descriptive
introduction to protected mode descriptor format,
memory protection and paging???


http://wiki.osdev.org/Protected_mode
Posted on 2009-07-31 18:54:31 by SpooK
where I can get a descriptive
introduction to protected mode descriptor format,
memory protection and paging???

Intel® 64 and IA-32 Architectures Software Developer's Manuals
They are VERY formal and VERY detailed, so I suggest you look at the link provided by Spook first.

But know that you DON'T need to know how protected mode works in order to code apps for Windows and other 32-bit protected mode OSes. More than 99% of programmers don't really know what protected mode really is ^^'
Posted on 2009-07-31 19:53:50 by ti_mo_n
That's right, we just code userland apps, under the PSDK and API!
Posted on 2009-08-01 05:28:05 by Homer


Also, about your mention of assemblers: just try a few out and pick the one you like. Despite what fanatics will tell you, there is no "best" assembler. Each one has its own good points and bad points. I like the MASM/POASM/JWASM style assemblers for quickly whipping up demo code and testing out routines where I really don't care about optimizing the whole application and I'm willing to let the assembler handle all the stuff that isn't important at that time. I like assemblers like NASM/GOASM for creating production code because they don't assume too much about your code and perform a lot of 1 to 1 translation. With me, it's more about what's supported where. When I'm on Windows I tend to use POASM and GoASM. PoASM supports the HL syntax I like for prototyping code, and GoASM, although the macro engine isn't the greatest, has a lot of shortcuts that NASM just doesn't have (like D, W, B instead of DWORD, WORD, and BYTE, plus anonymous labels). On GNU/Linux I use JWASM for prototyping as it's the only one of the "HL Assemblers" which supports Linux and doesn't demolish the language. For any production code on Linux/BSD (or anything else that supports IA16/32/64, other than Windows) I use NASM due to its powerful macro engine while still supporting 1 to 1 translation.

You'll see a lot of "Assembler Wars" start up every now and then, but to be honest it's all a matter of preference, and it doesn't really matter what you use as long as you are comfortable using it. If you like the HL style assemblers, by all means use them. It's better to write software using a tool you are comfortable with than to try and be "cool" and use one which confuses you. Truth is, unless you are comfortable using the 1 to 1 style assemblers you aren't going to be able to write better optimizations than the HL ones can generate anyway, so what's the purpose of using one? But that's enough rambling out of me. :lol:



That is a nice summary of various assemblers when writing code directly in assembly. Are all the assemblers you mention above ones that have been around for a long time and will be around for a long time? Are they all used as mission critical pieces of some large projects that just cannot afford any of these particular assemblers to die?

What about when writing a compiler with assembly as the target language? I'd think the HL assemblers are not good targets for that and an assembler like NASM would be a better target choice, yes?

Peter
Posted on 2009-08-01 15:39:32 by petermichaux
When you want your compiler to provide a perfect assembly listing that the user can assemble by hand, you need to target an assembler that lets you control the object module very precisely. By controlling the object structure precisely, I mean special directives like SEGMENT and PUBLIC. If you take a look at any compiler-generated assembly listing, you'll see what I mean.

A lot of assemblers are suitable for this but not all of them can do it (RosAsm is nice for learning assembly but you don't have this kind of control).
Posted on 2009-08-01 20:43:21 by ChaperonNoir

That is a nice summary of various assemblers when writing code directly in assembly.

Thank you :)


Are all the assemblers you mention above ones that have been around for a long time and will be around for a long time?



MASM and NASM have both been around a very long time, and PoASM is part of the PellesC toolkit, which was based on the original LCC compiler, so you could kinda say it's a restoration of an old assembler (although it far surpasses LCC's assembler and I've never had issues with it generating code that I didn't expect). GoASM is relatively new, but it's VERY stable. JWASM is very new; I wouldn't really trust its HL directives except in certain cases, like .IF/.ELSE/etc., which simply generate CMPs and conditional jumps.


Are they all used as mission critical pieces of some large projects that just cannot afford any of these particular assemblers to die?


Honestly, if you are doing mission critical or safety critical code you should be using the assembler which is suggested by the manufacturer of the device you are working with. MASM and NASM are both pretty decent for M/C and S/C development as they have had years of testing behind them and are recognized by industry leaders.


What about when writing a compiler with assembly as the target language? I'd think the HL assemblers are not good targets for that and an assembler like NASM would be a better target choice, yes?


Not true at all. MASM is one of those HL assemblers and it's been the backbone of Visual C/C++ since its earliest releases. PoASM is also an assembler which was originally created as the backbone of a compiler. Yes, I'm sure the LL ones are probably easier to work with as far as code generation goes (in fact I think PellesC actually uses a NASM style syntax which gets passed through an internal assembler now, leaving PoASM only for backwards compatibility). But to say that they aren't suited for it is definitely an overstatement. In fact, JWASM is an open source fork of Open Watcom's WASM, which also has its roots as a C compiler backend. ;)

~Bryant
Posted on 2009-08-02 02:19:23 by Synfire
Hehe, I dunno about "MASM and NASM have both been around a very long time" :)
MASM is THE x86 assembler, it has been around since the early days of DOS. It was pretty much the first widely available assembler for x86.
NASM has only been around since the late 90s I believe.
So that's quite a difference there; to me MASM has always been around, but NASM still feels like a newcomer, because I can still recall when I first heard of it...
Not that it matters today though. Most of MASM's DOS heritage is irrelevant, as DOS became obsolete years ago. NASM has become a very popular assembler in its own right, and being an open source solution, it will likely be around for a long time (I believe the original developers have long abandoned it, and others have taken over. One of the original developers was Simon Tatham btw, of PuTTY fame).
Posted on 2009-08-03 04:08:00 by Scali
NASM had its first widely available release in '91, which is a pretty long time ago compared to assemblers like PoASM and GoASM. You are right though, in the grand scheme of things, compared to assemblers like MASM and A86 it's just a baby.

EDIT: Correction: according to the revision documents it was 1996... funny, I could have sworn it came out before then; apparently I was wrong.  :lol:
Posted on 2009-08-03 07:25:28 by Synfire
Yea... I find that quite funny actually... I mean, in the 80s to early 90s, there were only a handful of x86 assemblers commonly used. Mainly MASM and TASM... And back then, assembly was actually still quite 'mainstream'... By the mid-90s, most programmers no longer used assembly on a daily basis... in fact, by then you had the first generation of programmers who had never used assembly at all. Yet from then on you see new assemblers popping up left and right.
Bit ironic really... the fewer people use assembly, the more assemblers are being offered.
Posted on 2009-08-04 10:13:10 by Scali