Assembly Language Syntax

From ASM Book

This chapter describes the syntax and the grammar of the 80x86 assembly language.

Contents

General Outline

Lexical Issues
  Whitespace
  Comments
  Identifiers
  Literals
     Integers
     Characters
       ASCII
       Unicode
       DBCS
     Strings
       ASCII
       Unicode
       Types
         Null-Terminated (Zero-Terminated)
         Dollar-Terminated
         Length-Prefixed
         Descriptor-based
         Mixed-mode HLA Strings
     etc...
  Keywords
  Separators
Instruction Syntax
  General Instruction Syntax
  Operands
     Registers
     Memory variables
     Literals
     Expressions
Labels, Variables and Data Definition
  Data Definition and Types
    Simple Types
      BYTE or DB
      WORD or DW
      DWORD or DD
      QWORD
      etc...
    Packed Data Types
      BCD
      etc...
Operators
  Separators
    Comma (,)
    Period (.)
    Line-Extension (\)
  Arithmetic
    *
    /
    +
    -
    MOD
  Bitwise
    Bitwise Logical
      AND
      OR
      NOT
      XOR
      etc...
    Bit Manipulation
      SHR
      SHL
      etc...
  Relational
    ==
    <
    >
    !=
    !
    >=
    <=
  Grouping
    Parentheses ()
    Brackets []
    Braces {}
    Quotes "", ''
    Angled-Brackets <>
    etc...
  Assignment
    =
    EQU
    :=
  Special
    ?
    etc...
Assembler Directives
  (TODO)
Layout and Style
  Code
    Traditional Linear
    Indented
    Mixed
  Comments
    Procedure Details
    Line Details
    Single-Line
    Multi-line
  Labels

Instruction Syntax

Assembly languages differ from one processor family to another, and assemblers for the same processor also differ in syntax. We will be using the Intel Architecture 32-bit (IA-32) assembly language syntax throughout. The syntax accepted by the various 80x86 assemblers varies somewhat, but most of them are based on the Intel Architecture assembly language. So, unless stated otherwise, "assembly language" here means the IA-32 syntax. An instruction in the IA-32 syntax format looks like this:

  label:   mnemonic   operand1, operand2, operand3   ; Comment

Example:

  mylabel:     mov  eax, 01           ; Copies 01 into the eax register.

Here "MOV" is the mnemonic and "MOV EAX, 01" is the instruction. Whenever we say "the MOV instruction", we are referring to the complete statement and not just to the mnemonic "MOV" itself.

To keep the syntax above clean and simple, we did not use any markup to indicate optional components. The version below marks the optional parts with curly braces; only the mnemonic is required. We use curly braces this way only in syntax definitions like this one.

  {label:} mnemonic {operand1} {, operand2} {, operand3}   {; Comment}
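
As a rough illustration of the optional parts, the MASM sketch below uses a different combination of label, operands, and comment on each statement. The `.386`, `.model`, `.code`, `PROC`/`ENDP`, and `END` lines are scaffolding assumed here only so that the fragment forms a complete 32-bit source file; the names `main` and `start` are hypothetical.

```asm
.386
.model flat, stdcall
.code
main PROC
        nop                         ; mnemonic only, no operands
        inc     eax                 ; mnemonic + one operand
start:  mov     eax, 1              ; label + mnemonic + two operands
        imul    eax, ecx, 10        ; mnemonic + three operands
        ret
main ENDP
END main
```

Every statement carries a comment here for explanation only; the comment is optional too, as the braces in the syntax definition indicate.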

MASM Specific

Microsoft Macro Assembler (MASM) follows this syntax closely, so every example that is not marked as specific to another assembler is written for MASM.

HLA Specific

  label:   mnemonic(   operand1, operand2, operand3 );   // Comment

Example:

  mylabel:     mov( 01,  eax );           // Copies 01 into the eax register.

HLA's syntax differs considerably from MASM's. In general, the operands are reversed: the source operand comes first and the destination operand second, the opposite of MASM's (dest, src) order.

Also note that HLA uses a functional syntax for instructions, treating operands as though they were parameters to a function that does the operation.
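
The operand order is easiest to see with two registers, where the direction of the copy is not obvious from the operands themselves. The MASM sketch below shows the destination-first form, with the corresponding HLA statement in each comment. The scaffolding directives and the name `main` are assumptions added only to make the fragment a complete source file.

```asm
.386
.model flat, stdcall
.code
main PROC
        mov     ecx, 7          ; ecx = 7      HLA: mov( 7, ecx );
        mov     eax, ecx        ; eax = ecx    HLA: mov( ecx, eax );
        ret
main ENDP
END main
```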

HLA supports an interesting feature known as instruction composition. This allows you to specify one instruction as an operand of another, e.g.,

  mov( mov( 5, ebx ), eax );           // Copies 5 into the ebx and eax registers.

Whenever an instruction appears as an operand of another, HLA emits the interior instruction first and then, when processing the outer instruction, substitutes the interior instruction's destination (second) operand in its place. You will rarely find instruction composition used as in this example, but it is quite useful when expanding macros, when calling procedures, and in high-level control structures, e.g.,

  if( mov( i, eax ) ) then      // "true" if "i" contains a non-zero value.
    << statements >>
  endif;
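
The expansion rule for instruction composition can be sketched in MASM syntax as the sequence of plain instructions that the two HLA examples above correspond to. This is an illustration under assumptions, not HLA's actual output: the `test`/`jz` pair is one plausible way to check for a non-zero value, and the exact instructions HLA emits may differ. The variable `i`, the label `skip`, the procedure name `main`, and the scaffolding directives are hypothetical.

```asm
.386
.model flat, stdcall

.data
i       DWORD   1               ; hypothetical variable standing in for "i"

.code
main PROC
        push    ebx             ; preserve ebx, which this sketch overwrites

        ; HLA: mov( mov( 5, ebx ), eax );
        mov     ebx, 5          ; interior instruction comes first
        mov     eax, ebx        ; its destination (ebx) becomes the outer source

        ; HLA: if( mov( i, eax ) ) then ... endif;
        mov     eax, i          ; interior instruction comes first
        test    eax, eax        ; check its destination (eax) for zero
        jz      skip            ; zero means "false": jump past the body
        nop                     ; placeholder for the body of the if
skip:
        pop     ebx             ; restore ebx
        ret
main ENDP
END main
```

Note that the low-level form needs an explicit label (`skip`) for the branch target, which the high-level `if` ... `endif` form hides.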

HLA fully supports labels like any other assembler, but the use of high-level control structures usually obviates the need for such labels in actual source code. For those who prefer to eschew the high-level control structures and write "low-level" assembly code, labels use the same basic syntax in HLA as in other assemblers.


FASM Specific

(todo...)


GoASM Specific

(todo...)


NASM Specific

(todo...)
