Hmm, I was wondering: how do you create an assembler? How do you code binaries? How do you compile them?

I have no idea... Can anyone please shed some light on this one?
Posted on 2001-08-22 00:15:53 by stryker
Step 1: If you have a wife and family get divorced. You won't have time for them.

Step 2: Sell your house and get an apartment the size of a matchbox.

Step 3: Read every book on processors and languages that you can find.

Step 4: Do it.

Conclusion: Some find programming easy and some give up. If you want to be a top-notch programmer you have to marry your computer and live in a matchbox so she will never be out of your sight (lol). A knack for logic might help, too.

I'm sure your question was not as serious as this response.

The coder of QIII might have a suggestion or two for you.

Look him up.

You might find him counting the cash he receives from Microsoft for allowing them the use of his engine.

Posted on 2001-08-22 01:06:11 by titan
There are some books about compiler design, but by a compiler I (not only I, WE) mean something that translates high-level languages into executable trash.

Creating an assembler is very easy: you check the instruction name typed in by the user, look it up in a table and turn it into binary... really easy. Of course your assembler has to remember the location of each label so you can calculate the relative distance of a jmp/jcc/call, etc., but those aren't problems at all.
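
The table lookup and the relative-distance calculation could be sketched roughly like this in Python. This is only an illustration under assumptions: the `encode` function and the `labels` dictionary are made-up names, only a few one-byte x86 opcodes plus `call rel32` (opcode E8, 32-bit mode) are covered, and the label addresses are assumed to be known already.

```python
import struct

# Tiny table of x86 instructions that take no operands (one opcode byte each).
ONE_BYTE_OPCODES = {"nop": 0x90, "ret": 0xC3, "int3": 0xCC, "hlt": 0xF4}

def encode(mnemonic, operand, address, labels):
    """Encode one instruction located at `address` (hypothetical helper).

    `labels` maps label names to addresses and is assumed to be filled in
    already, for example by an earlier pass over the source.
    """
    mnemonic = mnemonic.lower()
    if mnemonic in ONE_BYTE_OPCODES:
        # Plain table lookup: the mnemonic maps straight to its opcode byte.
        return bytes([ONE_BYTE_OPCODES[mnemonic]])
    if mnemonic == "call":
        # call rel32 is opcode E8 followed by a 32-bit little-endian
        # displacement, measured from the end of the 5-byte instruction.
        displacement = labels[operand] - (address + 5)
        return b"\xE8" + struct.pack("<i", displacement)
    raise ValueError("unknown instruction: " + mnemonic)
```

For instance, `encode("call", "start", 0x10, {"start": 0})` gives the bytes E8 EB FF FF FF, because the target lies 21 bytes before the end of the call instruction.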

You also have to be able to build a PE exe file that Windows likes, and that's maybe a bigger problem when you start ;)
Posted on 2001-08-22 06:56:04 by lifewire
Very easy...
Did you ever try it??

Posted on 2001-08-22 07:07:15 by (scalp)
I've also heard from someone that it's not so complicated.
They say that basically you have to translate instructions into opcodes.
Could that be?
Posted on 2001-08-22 07:27:43 by Bit7
It wouldn't hurt to look at the source code of some assemblers. SpASM, FASM, NASM, for example - the source code is in SpASM, FASM, C - respectively. I don't recommend copying them, but they do show some of the general mechanics that take place in an assembler. The rest of the bells and whistles are up to you. Have fun, it's a great way to learn the instruction set.
Posted on 2001-08-22 08:35:43 by bitRAKE
But hey! A sick thought is a good start... Ha ha... Anyway, FYI, I have neither a wife nor children...
Posted on 2001-08-22 12:12:04 by stryker
A very basic assembler

An assembler for a simple processor, like one of the Microchip PICs, is relatively easy. All instructions are the same length, and the standard basic assembly language doesn't contain fancy syntax for addressing modes.

All you're left to do is create a simple parser (the only complication is handling expressions), maintain a symbol table for labels and their values, keep track of where the next available location is, and generate absolute (nonrelocatable, nonlinkable) code.
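
To give an idea of the expression handling mentioned above, here is a minimal recursive-descent evaluator in Python. It is a sketch under assumptions, not how any particular assembler does it: the `evaluate` name is made up, only integers, symbols, parentheses, unary minus and `+ - * /` are handled, and `/` is treated as integer division.

```python
import re

# A token is a hex number, a decimal number, a symbol name, or an operator.
TOKEN = re.compile(r"\s*(0x[0-9A-Fa-f]+|\d+|[A-Za-z_]\w*|[-+*/()])")

def evaluate(text, symbols):
    """Evaluate an integer expression such as 'base + 2*(count - 1)'."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError("bad character at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    tokens.reverse()                      # so that pop() yields the next token

    def peek():
        return tokens[-1] if tokens else None

    def primary():
        if not tokens:
            raise ValueError("unexpected end of expression")
        tok = tokens.pop()
        if tok == "(":
            value = expr()
            if not tokens or tokens.pop() != ")":
                raise ValueError("missing ')'")
            return value
        if tok == "-":                    # unary minus
            return -primary()
        if tok[0].isdigit():
            return int(tok, 16) if tok.lower().startswith("0x") else int(tok, 10)
        if tok[0].isalpha() or tok[0] == "_":
            if tok not in symbols:
                raise ValueError("undefined symbol: " + tok)
            return symbols[tok]
        raise ValueError("unexpected token: " + tok)

    def term():                           # handles * and /
        value = primary()
        while peek() in ("*", "/"):
            op, rhs = tokens.pop(), primary()
            value = value * rhs if op == "*" else value // rhs
        return value

    def expr():                           # handles + and -
        value = term()
        while peek() in ("+", "-"):
            op, rhs = tokens.pop(), term()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expr()
    if tokens:
        raise ValueError("unexpected token: " + tokens[-1])
    return result
```

With the symbol table `{"base": 0x100, "count": 4}`, the expression `base + 2*(count - 1)` evaluates to 262.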

The generated code can be in any format acceptable to a PIC programmer. A number of EPROM programmers can program PICs.

Two passes are sufficient to handle the "forward reference" problem for the PICs. The first pass determines the values of all labels. The second pass evaluates all expressions, and generates code.

A typical restriction on equates (EQU) is that any labels used in the expression are previously defined. That allows the equate label to be defined on the first pass.
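
The two passes and the EQU restriction might look something like the following Python sketch. Everything specific here is an assumption for illustration: the 16-bit instruction word (4-bit opcode, 12-bit operand), the mnemonics, and the names `parse`, `value_of` and `assemble` are invented and are not a real PIC encoding. Operands are kept to a single number or symbol to keep the example short.

```python
# Hypothetical fixed-length machine: 4-bit opcode, 12-bit operand.
OPCODES = {"NOP": 0x0, "LOAD": 0x1, "ADD": 0x2, "STORE": 0x3,
           "GOTO": 0x4, "HALT": 0xF}

def parse(line):
    """Split a source line into (label, fields); the label may be None."""
    line = line.split(";", 1)[0].strip()          # drop any comment
    label = None
    if ":" in line:
        label, line = line.split(":", 1)
        label = label.strip()
    return label, line.split()

def is_equ(fields):
    return len(fields) == 3 and fields[1].upper() == "EQU"

def value_of(operand, symbols):
    """Resolve an operand that is either a known symbol or a number."""
    if operand in symbols:
        return symbols[operand]
    try:
        return int(operand, 0)
    except ValueError:
        raise ValueError("undefined symbol: " + operand) from None

def assemble(source):
    lines = [parse(text) for text in source.splitlines()]
    symbols = {}

    # Pass 1: assign a value to every label and equate.
    location = 0
    for label, fields in lines:
        if is_equ(fields):
            # Symbols used by an EQU must already be defined at this point.
            symbols[fields[0]] = value_of(fields[2], symbols)
            continue
        if label:
            symbols[label] = location
        if fields:
            location += 1                          # every instruction is one word

    # Pass 2: all symbols are known now, so generate the code.
    words = []
    for label, fields in lines:
        if not fields or is_equ(fields):
            continue
        opcode = OPCODES[fields[0].upper()]
        operand = value_of(fields[1], symbols) if len(fields) > 1 else 0
        words.append((opcode << 12) | (operand & 0xFFF))
    return words

SOURCE = """
count   EQU 3
start:  LOAD count
        GOTO done      ; forward reference, resolved by pass 1
        ADD 1
done:   HALT
        GOTO start
"""
```

Calling `assemble(SOURCE)` yields the words 0x1003, 0x4003, 0x2001, 0xF000 and 0x4000. The forward reference to `done` works because pass 1 has already recorded its location before pass 2 encodes the GOTO.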

Common complications

Conditional assembly, macros, linkable modules.
Each subject is a long discussion.

For the ambitious

The standard Intel, and thus Microsoft, syntax for the x86 has some characteristics that make it especially difficult to handle. The bit-level encoding of an instruction depends on the kinds of operands it is given. Some instructions have both a short form and a long form. Some keywords are both instruction names and expression operators (NOT, AND, OR).
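
As one concrete case of the short form / long form issue, here is a small Python sketch that picks between the two encodings of an unconditional x86 jump: `EB` with an 8-bit displacement, or `E9` with a 32-bit displacement. The function name `encode_jmp` is made up, and 32-bit mode is assumed.

```python
import struct

def encode_jmp(address, target):
    """Encode `jmp target` for an instruction located at `address`."""
    # Displacements are measured from the end of the instruction.
    short_disp = target - (address + 2)       # EB cb is 2 bytes long
    if -128 <= short_disp <= 127:
        return b"\xEB" + struct.pack("<b", short_disp)
    near_disp = target - (address + 5)        # E9 cd is 5 bytes long
    return b"\xE9" + struct.pack("<i", near_disp)
```

The awkward part is that the chosen form changes the size of the instruction, which moves every label after it and can in turn change the form needed by other jumps. An assembler either has to repeat the sizing until nothing changes or fall back to the long form for forward references.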

And if you want to implement all the high-level features of MASM, it could take a while to work out...
Posted on 2001-08-22 13:17:32 by tank
It all depends on what you call an assembler. As "lifewire" says (though
I suspect, with Scalp, that he never tried it...), writing a simple encoder
is simple enough. On one hand you have a source; on the other hand,
a set of tables that you use to translate the instructions into their
encodings. If the source is assumed to be perfect and the syntax has
zero flexibility, you might do this in 2 weeks, if you are clever and
know the x86 organisation well.

Now, the real fact is that this is just NOTHING. You also have to handle
all the possible errors the user will make, give some flexibility to the
syntax, and implement a complete macro parser. All this is about one
year of work for a simple assembler like MASM or NASM. For a
full-featured assembler, I mean one with an IDE, resource editors, linker,
debugger and so on, this is between 3 and 5 full years of work.

Some examples from SpAsm development:

- The Dialog Editor source alone is 120 screens long.

- The error management has been rewritten from scratch 4 times (!!!).

- Development of the Macros/Equates parser took close to 1 year.

- SpAsm is now almost finished (after 3 years). Estimated time to make it
perfect: 2 years.

Needless to say, for such things, you first need to have enough money
to live on without working an outside job.

If you want to take a look at what an assembler looks like, I recommend
FASM, not SpAsm (I wrote the SpAsm encoder in a very particular way
that is entirely designed for speed, and this is not at all the regular
way. In fact, as I only see now, encoding speed is of no interest,
because most of the compile time is eaten by macro and equate unfolding,
not by encoding, which takes almost no time anyway).

NASM is very standard and well programmed, too, but, alas, written
in C.


Oooopppppssss! I forgot:

SpAsm V.3.02b uploaded today. Good luck everybody!
Posted on 2001-08-22 14:16:58 by Betov
Wow, I never thought this could take that long. My estimate was about 3++ years (for something that behaves somewhat like MS Visual Studio - the ctrl-j stuff [something pops up]...). But anyway, I could tinker with a little bit of code and hopefully finish in a decade... hopefully...

Posted on 2001-08-22 15:46:53 by stryker
I tried to create an assembler in high school. I thought I could do one that creates .COM files quite easily, because .COM files are just pure code, with no headers or anything else to deal with. Boy, was I wrong.
x86 has something like 20+ different opcodes for the MOV instruction, and you have to figure out which one to use depending on the parameters. You also have to interpret the parameters to create the ModR/M and SIB bytes. It isn't that difficult if you make your syntax very simple, but following the standard conventions is hard.
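
To show what picking the opcode and building the ModR/M byte involves, here is a rough Python sketch for just three of the MOV forms. It assumes 32-bit mode and 32-bit registers only (a .COM assembler would deal with the 16-bit variants), and `modrm` and `encode_mov` are names made up for this example.

```python
import struct

# Register numbers as used in the ModR/M byte and in the B8+rd opcodes.
REGS32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
          "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def modrm(mod, reg, rm):
    """Pack the three ModR/M fields: 2 bits mod, 3 bits reg, 3 bits r/m."""
    return bytes([(mod << 6) | (reg << 3) | rm])

def encode_mov(dst, src):
    """Encode three 32-bit forms of MOV; dst and src are operand strings."""
    if dst in REGS32 and src in REGS32:
        # 89 /r : MOV r/m32, r32 (register-direct, mod = 11)
        return b"\x89" + modrm(0b11, REGS32[src], REGS32[dst])
    if dst in REGS32 and src.startswith("[") and src.endswith("]"):
        base = src[1:-1].strip()
        if base not in REGS32 or base in ("esp", "ebp"):
            # With mod = 00, r/m = 100 (esp) means a SIB byte follows and
            # r/m = 101 (ebp) means a bare disp32; neither is handled here.
            raise NotImplementedError("addressing mode not covered: " + src)
        # 8B /r : MOV r32, r/m32 (memory through a base register, mod = 00)
        return b"\x8B" + modrm(0b00, REGS32[dst], REGS32[base])
    if dst in REGS32:
        # B8+rd id : MOV r32, imm32 (register number is added to the opcode)
        value = int(src, 0) & 0xFFFFFFFF
        return bytes([0xB8 + REGS32[dst]]) + struct.pack("<I", value)
    raise NotImplementedError("operand combination not covered")
```

So `mov eax, ebx` comes out as 89 D8, `mov ecx, [ebx]` as 8B 0B, and `mov eax, 1` as B8 01 00 00 00: three different opcodes for one mnemonic, and this covers only a small part of the full set.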

Of course, I didn't know anything about syntactical analysis or lexical tokens and whatnot back then (I don't really know that shit now), but it ain't no walk in the park to make an assembler (even a simple one).
Posted on 2001-08-22 17:48:50 by Satrukaan
Did you know that the FORTH assembler for the 80x86 is ONLY 2 pages of text... about 2 KB of ASCII?

It also has .IF .ENDIF .REPEAT .WHILE .CASE etc., and ALL high-level FORTH code can be mixed FREELY inside ASM code, which makes macros a breeze, so a preprocessor is not even required...

Well, it does not generate a PE... but it's a start... Also, the syntax is a little bit in reverse :D but that is Forth style.
Posted on 2001-08-22 18:00:43 by BogdanOntanu
BogdanOntanu, I'm sure it'd look like Romanian to me, but could you post a link to those two pages of code?
Posted on 2001-08-22 19:55:07 by bitRAKE