I've been trying to learn ASM for a while now, but I'm finding it very difficult to find a proper tutorial. The thing is, all the different assembler stuff is really confusing to me. What I've gathered so far is that there are several different assemblers, such as MASM, FASM, NASM, and others, and that they all have their own syntax. I also know that MASM seems to be the most popular/widely used assembler, so I started reading Iczelion's Win32 ASM Tutorial. The thing is, there are terms that I don't know, which tells me that I'm missing some information. I remember reading part of The Art of Exploitation, and there was some Assembly in there, but it didn't look at all like MASM syntax, and it used terms such as eax, push, etc. I took a look at the ASM Book, but I couldn't quite understand its organization. So my question is, where do I start, if I want to learn Assembly?
Posted on 2009-12-21 13:08:28 by grey88
Iczelion's tutorials, of course ^^ They are great for starters. And if you have any questions then just ask :)
Posted on 2009-12-21 15:28:48 by ti_mo_n
To expand on ti_mo_n's suggestion, you'll want to use Iczelion's in conjunction with the knowledge of a higher level language, such as C. This should help teach you about usage, variables, labels and such... bridging the knowledge gap.

To learn more about assembly language itself, I would recommend grabbing AoA (Art of Assembly) 16-bit version. There will be some things you will learn that won't apply in the 32/64-bit world, but most of the information is useful and still applicable.
Posted on 2009-12-21 17:12:07 by SpooK



Thanks. Are there any other books you would recommend I read to learn some of the more relevant stuff, too? Also, I am in the process of learning C too. Apparently C++ is only really good for OOP which is really only needed for large scale projects.
Posted on 2009-12-21 19:35:57 by grey88
Sorry to double-post, but I also remember finding that HLA is not considered "real" Assembly... what exactly is the difference between HLA and just regular Assembly?
Posted on 2009-12-21 19:37:31 by grey88



HLA is a mixture of traditional assembly language and high-level constructs. It is meant as a learning tool to bridge the gap between high-level languages (C, C++, etc...) and low-level languages (assembly language, etc...) but in my experience users tend to be more confused and less informed about assembly language when using HLA as a crutch. If you were a CS student interested in merely "getting through your intro to assembly language", then HLA would be OK. If you want to really learn assembly language, HLA won't be too much use for you as you will get caught up in "HLAisms" instead of learning assembly language.

As for "real" assembly language, that is traditionally any assembly language that has a near 1:1 representation with processor instructions, i.e. low-level. Every line of code is traceable and doesn't involve higher-level interpretation. Now, most modern assemblers are macro assemblers, which allow for certain high-level constructs called macros. Macros allow for higher-level interpretation because, let's face it, you don't necessarily want to repeat yourself if you don't have to.
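For readers coming from C, the C preprocessor gives a rough feel for what a macro assembler's macros do: the macro body is pasted in at every use before translation, so nothing is "called" at run time. This is only an analogy, not assembler syntax, and the names SWAP_INT and sort3 are made up for illustration.

```c
/* Hypothetical macro: each use is replaced by the full body before the
   compiler proper sees the code, much like an assembler macro expands
   into plain instructions. */
#define SWAP_INT(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Sorts three ints in ascending order. The swap logic is written once
   in the macro and expanded three times here. */
void sort3(int v[3])
{
    if (v[0] > v[1]) SWAP_INT(v[0], v[1]);
    if (v[1] > v[2]) SWAP_INT(v[1], v[2]);
    if (v[0] > v[1]) SWAP_INT(v[0], v[1]);
}
```

The generated code is the same as if the three swaps had been typed out by hand, which is the sense in which macros add convenience without hiding what the machine executes.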

As for other books, most decent material can be found on the web... for free. A good site with ASM tutorials is http://www.drpaulcarter.com/pcasm/, albeit NASM-centric.
Posted on 2009-12-21 20:47:02 by SpooK
Thanks again. I'm definitely in the "learn Assembly for Assembly" group. Is there anything I should be aware of while I'm reading this other tutorial? Considering that it's NASM-centric...
Posted on 2009-12-21 22:17:43 by grey88

Is there anything I should be aware of while I'm reading this other tutorial? Considering that it's NASM centric...


Not really. Anything you can do in NASM usually has an equivalent convention in another assembler. Also, NASM is an offshoot of Intel-style syntax, so there is minimal effort in moving to MASM/FASM/YASM/TASM. GAS/AT&T is a whole other beast altogether.
Posted on 2009-12-21 22:41:59 by SpooK
The Art of Assembly 16-bit edition is a fine book. I would suggest you read the chapters about general CPU organization, memory addressing and that sort of thing.
The things that are very DOS-related or BIOS-related (eg int 21h functions) aren't very useful as they don't apply to modern 32-bit or 64-bit OSes.

If you are running a 64-bit Windows OS, you cannot run 16-bit DOS code natively. A solution to that would be to download DOSBox. This will give you an emulated 16-bit x86 environment, which allows you to run 16-bit DOS code and use BIOS functions etc. This way you could play around with some of the examples in the book.

Alternatively, you could just read the basic 'theory' chapters, and skip the examples... instead using the Iczelion tutorials for some 'hands on' code.
Sadly I don't know of any 'checklist' of differences between 16-bit/32-bit/64-bit... But mainly the addressing modes are quite different. You don't need to bother with segment:offset addresses in 32/64-bit mode. Also, any general-purpose register can now be used in addresses, where 16-bit addressing only allowed bx, bp, si and di (so ax, cx and dx couldn't be used).
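To make the segment:offset point concrete, here is a minimal C sketch of how a real-mode (16-bit) x86 CPU turns a segment:offset pair into a linear address. The function name real_mode_linear is hypothetical, and the sketch ignores the wraparound behaviour some machines have at the 1 MB boundary.

```c
#include <stdint.h>

/* Real-mode address translation: the segment is shifted left by 4 bits
   (multiplied by 16) and the offset is added. Wraparound at 1 MB is
   deliberately not modelled here. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For example, real_mode_linear(0x1234, 0x0010) and real_mode_linear(0x1235, 0x0000) both give 0x12350: many different segment:offset pairs name the same byte, which is part of what makes 16-bit addressing awkward compared to the flat addresses used in 32/64-bit mode.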
Posted on 2009-12-22 03:16:43 by Scali
One concept that Grey88 (or anybody) will need to keep straight is the difference between an address, and the contents of memory at that address. Unfortunately(?), this is something that different assemblers do in different ways...

Address:

mov eax, foo ; Nasm
mov eax, offset foo ; Masm/Tasm, Gas with ".intel_syntax"
mov $foo, %eax # Gas

Contents:

mov eax, [foo] ; Nasm
mov eax, foo ; Masm/Tasm
mov foo, %eax # Gas

There are other differences, but this is one of the most confusing, IMHO.
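For anyone who knows a little C, the same address/contents distinction can be sketched with a pointer. This is an analogy rather than what any assembler emits; the variable foo stands in for the label of the same name, and the two function names are made up.

```c
#include <stdint.h>

/* Hypothetical 32-bit variable, playing the role of the label foo. */
int32_t foo = 42;

/* "Address": roughly what `mov eax, foo` (NASM) or
   `mov eax, offset foo` (MASM) puts in eax. */
int32_t *address_of_foo(void)
{
    return &foo;
}

/* "Contents": roughly what `mov eax, [foo]` (NASM) puts in eax. */
int32_t contents_of_foo(void)
{
    return foo;
}
```

Dereferencing the result of address_of_foo() gives the same value as contents_of_foo(); the address by itself is just a number saying where the value lives.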

If there are questions about "terms" or "missing information", ask! Someone will probably know...

Best,
Frank


Posted on 2009-12-23 12:29:44 by fbkotler
Contents:

mov eax, foo ; Masm/Tasm

TASM in ideal mode requires mov eax, [foo]
Posted on 2009-12-23 14:32:09 by ti_mo_n
Okay, I stand corrected. As I say... something a beginner will want to be clear about, 'cause different assemblers do it different ways...

Best,
Frank

Posted on 2009-12-24 04:09:23 by fbkotler
This syntax of using [] to denote a reg/mem operation is the de facto standard.
Posted on 2009-12-24 04:22:45 by Homer
Homer,

Doesn't reg/mem stand for "register or memory operand"? Because if the operand is a register, [] aren't used.

Perhaps "[] are used when the operand is a memory reference"? This isn't nit-picking; understanding this is crucial for a beginner: even the Intel SDM's ModR/M table lists bare registers in the column titled "Effective Address". (lea eax, ecx anyone? ;-)
Posted on 2009-12-24 04:37:13 by baldr
Just my 3 cents, if I may:

[] means a dereference and should, IMHO, be enforced by all assemblers. "mov eax, var1" doesn't make it clear whether you're loading the address of the variable or the value stored at that address (and you actually have to learn how a particular assembler understands it, or guess). Enforcing the [] would be consistent with how registers work: "mov eax, ebx" means loading the register, while "mov eax, [ebx]" means loading the value pointed to by that register.

In other words, if I see "mov eax, offset var" or "mov eax, [var]" I instantly understand what it means.
Posted on 2009-12-24 10:19:29 by ti_mo_n
There are only two ways to move data.
One is between registers, the other is between a register and a memory address.
I'm pretty sure even beginners learn the difference quickly.
Squabbling about the semantics of a term such as reg/mem is pointless given that the term is not clearly defined.
To me, it implies movement of data between register and memory, although I agree with you if I said R/M then I would be leaving it open to interpretation.
Posted on 2009-12-24 10:38:48 by Homer
There are only two ways to move data.
One is between registers, the other is between a register and a memory address.
<pedantic>And then there's the special issue of PUSH/POP with memory operands</pedantic> :)

I agree with timon that [] should be enforced on memory access, since it's clearer what's going on.
Posted on 2009-12-24 12:39:23 by f0dder

I agree with timon that [] should be enforced on memory access, since it's clearer what's going on.


Yep, and there's a really easy way to remember it.

Do I want the box? Or do I want what's inside?
Posted on 2009-12-24 12:42:17 by SpooK
Another issue ensues: dword ptr or ? eax contains a pointer, so the second one looks more readable. Or dword? Don't forget about default addressing segment overrides...
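On the size question: keywords such as byte, word and dword tell the assembler how many bytes to access at an address, because the address itself carries no size. Below is a small C sketch of that idea, reading 1, 2 or 4 bytes from the same location in x86's little-endian byte order. The function name load_le and the sample bytes are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Read `size` bytes (1, 2 or 4) starting at p, least significant byte
   first, which is the byte order x86 uses for memory operands. */
uint32_t load_le(const uint8_t *p, size_t size)
{
    uint32_t value = 0;
    for (size_t i = 0; i < size; i++) {
        value |= (uint32_t)p[i] << (8 * i);
    }
    return value;
}
```

With the bytes 78 56 34 12 in memory, a byte read gives 0x78, a word read gives 0x5678 and a dword read gives 0x12345678. Same address, three different results, which is why the operand size has to be stated somewhere.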

SpooK,

In the case of registers, eax means its contents, not its address. ;-)
Posted on 2009-12-24 14:09:44 by baldr

In the case of registers, eax means its contents, not its address. ;-)


This was said in the context of f0dder's statement about bracket enforcement for memory access.

It gets to be even more fun, for a beginner, if you conceptualize registers as data objects that get stored into memory. If so, how is DWORD even plausible?

Even better: how does one mov something when the effect is to clone it? Wouldn't that be cpy?

If there is anything else we can do to assist in the confusion understanding, please let us know :lol:
Posted on 2009-12-24 14:52:19 by SpooK