Hi, I've been here before but it's a looong time ago, and at that time I started a project called ML(Math Language) it was supposed to make it easier to type formulas etc... in assembly language, but it never really worked, well some weeks ago I had an idea of making my own private language(With compiler) and then expanding it as needed etc... well this might sound crazy but I actually succeded in making the math to assembly convert, well I havn't written the actual code, but it breaks down the math formulas, based upon signs(sign significance: Multiplication over Addition etc...) and it places Brackets first. It works, it is not completly optimized to short up memory use, and to minimize calculations, but it does a few dummy corrections, so everything should run, and an expert assembler can easy see the use of it, and probably can easly adjust it's errors to make it 100% optimized and save time using it etc... but right now it takes a formula like "A=10^2+10", and breaks it down to:


Temp_B=10^2
A=Temp_B+10

where the Temp_B is a temporarly variable called B, this is something the function creates itself based on the ascii codes etc.... As you see, if you try a harder formula like "A=10^2+10-12*62/1^(51*3)" it outputs:


Temp_B=51*3
Temp_C=10^2
Temp_D=1^Temp_B
Temp_E=12*62
Temp_F=Temp_E/Temp_D
Temp_G=Temp_C+10
A=Temp_G-Temp_F

I've included the program in the attachment.

Anyway the problem is now converting these Calculations(NOT MATHEMATICAL EXPRESSIONS) into assembly, and I want to minimize code space(not actual binary space though :wink:), and so I've come up with the following macros:


Addition macro SrcF:DWORD, SrcS:DWORD
XOR EAX, EAX
ADD EAX, SrcF
ADD EAX, SrcS

ret

Addition endm

Subtraction macro SrcF:DWORD, SrcS:DWORD
XOR EAX, EAX
ADD EAX, SrcF
SUB EAX, SrcS

ret

Subtraction endm

Multiplication macro SrcF:DWORD, SrcS:DWORD
XOR EAX, EAX
ADD EAX, SrcF
MUL EAX, SrcS

ret

Multiplication endm

Division macro SrcF:DWORD, SrcS:DWORD
XOR EAX, EAX
ADD EAX, SrcF
DIV EAX, SrcS

ret

Division endm

Now I am positive that this is either not optimized or not working at all, and that is why I've posted it here, because the macros I need are like "A = B + C", "A = B - C", "A = B * C" "A = B / C" etc... So I was wondering if anybody could help me out...
Thanks, and please commet the attached program.

- Julian
Posted on 2004-08-27 12:35:04 by JulianS
This sounds like a fun project. :) i'm sorry i cant help you but i alwasy wanted to do some sort of a math language or lib. But my knowledge about assembler is poor(as my english) . :lol:
Posted on 2004-08-27 17:43:53 by bj1500
You can replace "clear and add" sequence with "load"

Replace
xor eax,eax

add eax,srcF

with
mov eax,srcF


Also, you need to set EDX before dividing.

If you want negative numbers use
cdq

idiv srcS

If you want only positive numbers use
xor edx,edx

div srcS

If you're not using CALL to run the macro code, remove the RET instructions.
Posted on 2004-08-27 18:54:58 by tenkey
Also why not expand it to FPU and SSE, afterall most of the time when dealing with equations you'll want floating point numbers :) .
Posted on 2004-08-27 19:37:30 by Eóin
Matlab is really cool - student copy is $99. :) I'm not trying to pull you away from your coding, but I think it is important to see what is out there. You might want to borrow features for your own work. Matlab actually compiles the code internally to use optimized matrix libraries - impressive to say the least! The interface has some incredible ideas as well, but the language itself is strange = high learning curve. Thinking about problems from a matrix perspective is interesting as well.
Posted on 2004-08-27 19:50:05 by bitRAKE
Octave if a free matlab compatible program. Its not as good as matlab but I use it when working on my college matlab projects at home. Its worth checking out before deciding to part with money.

Oh and the new ms powertoy calculator is really cool too :)
Posted on 2004-08-27 20:03:04 by Eóin
Hi, I'd love to add FPU, but I have no idea as of how to implement it, I tried last time and failed... anyway here is new code:


Addition macro Dst:DWORD, SrcF:DWORD, SrcS:DWORD
MOV EAX, SrcF
ADD EAX, SrcS
MOV Dst, EAX
Addition endm

Subtraction macro Dst:DWORD, SrcF:DWORD, SrcS:DWORD
MOV EAX, SrcF
SUB EAX, SrcS
MOV Dst, EAX
Subtraction endm


Multiplication macro Dst:DWORD, SrcF:DWORD, SrcS:DWORD
MOV EAX, SrcF
MUL EAX, SrcS
MOV Dst, EAX
Multiplication endm

IMultiplication macro Dst:DWORD, SrcF:DWORD, SrcS:DWORD
MOV EAX, SrcF
IMUL EAX, SrcS
MOV Dst, EAX
IMultiplication endm


Division macro Dst:DWORD, SrcF:DWORD, SrcS:DWORD
MOV EAX, SrcF
CDQ
DIV srcS
MOV Dst, EAX
Division endm

IDivision macro Dst:DWORD, SrcF:DWORD, SrcS:DWORD
MOV EAX, SrcF
CDQ
IDIV srcS
MOV Dst, EAX
IDivision endm

Notice that I've included signed and unsigned :), as for optimizing. Though while these functions are great, I want my program code to be as small as possible, so the more macros,and the less comands in each, the better.
In FPU do I just do:


FPUAddition macro Dst:REAL10, SrcF:REAL10, SrcS:REAL10
FpuAdd ScrF, ScrS, Dst
FPUAddition endm

?.

Thanks for your help.

- Julian
Posted on 2004-08-28 04:10:38 by JulianS
FPU is a bit messy casue of the stack, it can be dome easilly enough but to do it optimally would require keeping intermediate results on the stack and keeping track of them, not very nice :o .

An example of the easy way would be to have macros like
FPUAddition macro Dst:REAL10, SrcF:REAL10, SrcS:REAL10

fld SrcF
fadd SrcS
fstp Dst
FPUAddition endm

Where all operands are memory ones.

SSE on the other hand can be done with registers like in the integer MACROs.
Posted on 2004-08-28 07:13:26 by Eóin
So this will work?


FPUAddition macro Dst:REAL10, SrcF:REAL10, SrcS:REAL10
FLD SrcF
FADD SrcS
FSTP Dst
FPUAddition endm

FPUSubtraction macro Dst:REAL10, SrcF:REAL10, SrcS:REAL10
FLD SrcF
FSUB SrcS
FSTP Dst
FPUSubtraction endm

FPUMultiplication macro Dst:REAL10, SrcF:REAL10, SrcS:REAL10
FLD SrcF
FMUL SrcS
FSTP Dst
FPUMultiplication endm

FPUDivision macro Dst:REAL10, SrcF:REAL10, SrcS:REAL10
FLD SrcF
FDIV SrcS
FSTP Dst
FPUDivision endm


- Julian
Posted on 2004-08-28 09:16:01 by JulianS
It certainly should.
Posted on 2004-08-28 10:27:33 by Eóin
Ok, here is new version, please commet any weakness, since I will probably be remaking it a couple of times to get 100% performance 8)

- Julian
Posted on 2004-08-28 11:16:55 by JulianS
Its nice alright, though if you want to be producing optimal code then then I think you'll need to move away from macros and output asm code.
Posted on 2004-08-28 11:56:41 by Eóin
You should get some info about optimizing multiplications and such thihgs, for example, if you the operation is: a = eax*2

I guess the output will be:
shr eax, 2
mov , eax

also how lea is used, but if for example: a = eax+ 3^2+2
will be first "reduced" to a= eax+11

The point is try do the less operations posible, mean that you will have constant and not constant values in a expression.

A constant value that only depend in constants values can be calculated like the second example, and a expression that depend of not constant values like the first, pheraphs can be optimized.

Then I see three diferent point that you should care about the expresion:

1) Check what constanst can be obtained at first and calculate them, you see a easy way for do this, with the things that you have done for now in your application? ;)
2) When you obtain this "reduced" expression, check what expressions can be handled with lea, shifts and pheraphs others (what others?)?
3) handle the rest of the expression in the normal way, that mean doing the rest of the operations without optimization, because cant be optimized more, if is well done the anterior step.


Also for handle fpu and mmx, pheraphs, after you can do the simple thing, can you add some flags to be set in your programm, like a selection choise, and let the user for this options.


About your programm, check that the output you are moving eax to eax, when pheraphs that can be valid for optimization????, is a nop instruction iirc, you can keep the track of the used registers and variables, also pheraphs for what registers should be keeped.


Then my sugestion, understand the macros for multiply, add, ... , in fact understand how this operations take place, example, how many is:
101*11?, simple, follow this rules:

0*1 = 0
1*1= 1

0+0=0
0+1=1
1+1= 10 (carry out)

When you multiply:
12
*32
-----
024
+36?
------
384

Is the same for binary digits
101
*11
------
0101
+1010 ; extra zero, like when multiply decimal numbers
-------
1111


If you do that with bytes, you will see for what the result is the double size than the operands, and for you can have signed and unsigned, following the two complement http://folk.uio.no/botnen/intel/vt/reference/ see cdq and mul.


Also the things about extend to mmx and others, pheraphs is best that depend on what are the variables implicated?, mean that you can enter in your programm declaration for the types, or that you can parse a asm file, read, and watch for the expresions that you whant your programm handle them:

option casemap:none
.....
result dd 0
......
mov eax, ebx
push ecx
= (eax*3)^ebx*sin(ecx-eax)
....

And output the same code, with this line commented, and the iutput of your programm writed down in assembly instructions... ;), is a hard work, I guess. But I see that you will do it.

That is what I sugest.
Posted on 2004-08-28 12:07:20 by rea
Thanks for your suggestions. I'll try doing as many as possible, but about the last one(how to write it in assembly), well it won't be totally like that, it will be like a GUI interface, where you can drag & drop windows, menus, buttons, labels etc... and then when you click one you get to the procedure in assembly, and of course then there are non-windows procedures like "Start" and "Main", and variables will be created differently... I havn't figured out the total design, but it's purpose is to save coding time, and it needs a GUI designer, so much I know.

- Julian
Posted on 2004-08-29 03:07:29 by JulianS
Just a question, if I had the choice bettwen DWORD variables and stack memory(Push/Pop), which would be the best to use? Also are there any limitations on the stack memory beside from your computers memory?

- Julian
Posted on 2004-08-31 09:45:55 by JulianS
Long time no see ;)

Registers are the fastest, local (stack) and static (memory) variables are much slower. Local variables are slightly faster than static ones.

As for the FPU, check this out:
http://en.wikipedia.org/wiki/Reverse_Polish_notation

With the RPN you can convert a formula like this:

((x + (y * z) / w

into this:

x y z * w /

which makes no sense to a human but it's perfect for a computer, since it translates directly into instructions for stack-based machines (like the Intel floating point unit):


fld x ; x
fld y ; y
fld z ; z
fmulp ; *
fld w ; w
fdivp ; /
fstp r ; r could be the result


There's also another advantage in using floats instead of ints: rounding. Suppose you have this formula:

1 / 2 * 2

The result should be 1. But with integer opcodes you round down all the intermediate results, so you get:

1 / 2 * 2 = 0 * 2 = 0

since 0.5 is rounded to 0. Floats are slower but the accuracy is higher -- intermediate results are calculated with decimals, so the above calculation would work.

Hope that helps :)
Posted on 2004-08-31 13:42:33 by QvasiModo
How would you do fast assembly calculations? What's the fastest and most stabile way to do it? I plan on integrating it all, where it tries to use EAX as much, then it will use whatever else is fastest etc... I know this is hard work, but I'd like to gain a performance which is faster then VB's, more stable, and of course only slightly less optimized then the popular C++ compilers. Any ideas are truely appreciated. Ohh, and thanks for the Reversed Polish notation, I'll look at it.

- Julian
Posted on 2004-09-01 05:37:38 by JulianS
If you want to come close to what C++ compilers are doing, then you will need to study existing compiler optimizations. It is an ongoing research area because "there's no such thing as the fastest code" (M. Abrams).

A simple algorithm for improving expression code is the Sethi-Ullman register allocator. You construct a data structure known as the expression tree and use that to select registers before or while creating executable code. It apparently creates the best register allocation for a single expression.

To get better results, you will need to hold several instructions in memory before outputting them to the target file. You will need to perform redundancy elimination on a block of code containing several instructions.

One simple optimization is peephole optimization. It basically is a set of ad hoc optimizations which reduce common instruction sequences with more optimal sequences. You can remove redundant "register loads" within the framework of peephole optimization.

More powerful register allocation and redundancy elimination techniques require you to be comfortable with graph theory, set theory, and complex data structures (linked lists, trees, networks).
Posted on 2004-09-01 15:20:25 by tenkey
Newest version, based on a new principle and way of thought. It is based on that all expressions can be broken down to a tree of calculations, my previous version did this, but without spacing and without reusing memory. Ok, tell me what you think, and please test it's limit, by the way it onyl supports "+" and "-".

- Julian
Posted on 2004-09-02 10:43:54 by JulianS
If any of you know any readings or alike(about Parsing or making your own compiler 8)), then please don't hesitate to tell me. Currently I am going to read:
http://tao.doc.wustl.edu/misc/

Maybe I'll implement more types of math parsers, then when you use the compiler you can chose the one which suits you(Speed, Memory, Readability etc...). Suggestions and comments are appreciated. Thanks again for the community help.

- Julian
Posted on 2004-09-03 14:40:33 by JulianS