I was thinking about code readability today and thought that one of the biggest obstacles to reading assembly is figuring out what's in the registers at a given point. It would be very cool if you could alias the registers in the same way you assume them. For example if you had a proc that counted bytes you could do this:

ECX ALIAS:cbBytes

EDI ALIAS:pArray

mov pArray,offset MyArray
xor cbBytes,cbBytes ; start the count at zero
:
mov al,[pArray]
inc pArray
or al,al
jz >
inc cbBytes
jmp <
:

ECX ALIAS:nothing
EDI ALIAS:nothing
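
For what it's worth, here is a minimal sketch of how an assembler front end might handle such a directive, assuming it is nothing more than a text substitution pass that runs before the real assembling. It is written in Python only to model the idea. The function name expand_aliases is made up for illustration, and none of this reflects how any existing assembler works. The anonymous labels and jumps are left out of the sample input for brevity.

```python
import re

def expand_aliases(lines):
    """Replace alias names with registers, honouring `REG ALIAS:name` lines."""
    aliases = {}  # alias name -> register
    out = []
    for line in lines:
        m = re.fullmatch(r"\s*(\w+)\s+ALIAS:(\w+)\s*", line)
        if m:
            reg, name = m.group(1), m.group(2)
            if name.lower() == "nothing":
                # Drop every alias currently bound to this register.
                aliases = {a: r for a, r in aliases.items() if r != reg}
            else:
                aliases[name] = reg
            continue  # directives themselves produce no output
        # Substitute whole identifiers only, so "pArray" never matches inside "pArray2".
        out.append(re.sub(r"\b\w+\b",
                          lambda t: aliases.get(t.group(0), t.group(0)),
                          line))
    return out

source = [
    "ECX ALIAS:cbBytes",
    "EDI ALIAS:pArray",
    "mov pArray,offset MyArray",
    "mov al,[pArray]",
    "inc pArray",
    "inc cbBytes",
    "ECX ALIAS:nothing",
    "inc cbBytes",  # alias already released, so this is left untouched
]
expanded = expand_aliases(source)
print("\n".join(expanded))
```

Once the alias is released with ALIAS:nothing, the name is passed through untouched, so a stale use would presumably show up as an undefined symbol when the code is assembled.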


Just a thought, feel free to point out how stupid the idea is :)
Posted on 2004-03-31 00:48:33 by donkey
I think one small improvement would be... an automatic alias :D, handled by scopes, but that gets a little hard when you use more than one alias for the same register in the same scope. Interesting problem... the assembler would have to read your mind ;).



By the way... is that the syntax Bogdan will use?
It would be nice to use jmp with the power-of arrow (the caret, pointing up for a jump backward... this keyboard is strange to me...), but I don't think a counterpart exists for going down.

Have a nice day or night.
Posted on 2004-03-31 01:09:39 by rea
As far as I understand, and I am not in any way given advance information by Bogdan, his assembler will be MASM/TASM compatible. But that does not mean he can't extend it a bit, with stuff like ALIAS. I am not sure how he would go about scoping the ALIAS but it should be done in the same way as ASSUME. Luckily it is not me who is writing the assembler so I just have to suggest, not worry about how to do it ;)
Posted on 2004-03-31 01:44:00 by donkey
Bogdan's assembler will be very powerful; it will support 16-, 32- and 64-bit coding.
Posted on 2004-03-31 03:48:32 by Vortex
Hi Donkey,

In FASM you can do

cbBytes fix ecx

pArray fix edi
Posted on 2004-03-31 07:16:54 by pelaillo
(I know this will probably sound like heresy) - but what about building a register allocator into the assembler? This would be an advantage for those people who do full-assembly coding and don't care to optimize every single routine. I'm not sure how exactly this should be done, perhaps by creating a range of "pseudoregisters", like 16 of them, and have the assembler do register spilling etc.

Of course these regs should have different names, and using the standard eax etc should do verbatim assembling of instructions, as is needed when optimizing things by hand.

For those doing full-asm coding, I don't think the idea of an "optimizing" assembler is bad, as long as it only "optimizes" when you tell it to.

Not like it will matter much to me personally, since I usually only do subroutines in assembly, but it might be worth considering.
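
To make the "pseudoregisters" idea a bit more concrete, here is a rough sketch in Python of the simplest allocation I can think of: hand out real registers first come, first served, and give every later pseudoregister a stack slot. The names v0, v1, ... for the pseudoregisters, the pool of ebx/esi/edi and the [ebp-N] spill slots are all assumptions made up for this example. It only decides where each pseudoregister lives; a real allocator would also have to emit loads and stores around instructions that touch spilled values, and would want to reuse registers once a value is dead.

```python
import re

def allocate(code_lines, real_regs=("ebx", "esi", "edi")):
    """Map each pseudoregister (v0, v1, ...) to a real register or a stack slot."""
    mapping = {}
    free = list(real_regs)
    spill_offset = 0
    for line in code_lines:
        for pseudo in re.findall(r"\bv\d+\b", line):
            if pseudo in mapping:
                continue  # already placed
            if free:
                mapping[pseudo] = free.pop(0)   # a real register is still available
            else:
                spill_offset += 4               # out of registers: spill to the stack
                mapping[pseudo] = f"[ebp-{spill_offset}]"
    return mapping

code = ["mov v0, 1", "mov v1, 2", "add v0, v1", "mov v2, v0", "mov v3, v2"]
mapping = allocate(code)
for pseudo, home in mapping.items():
    print(pseudo, "->", home)
```

Code written with plain eax, ebx and so on never matches the v-number pattern, so it would pass through verbatim, which is the behaviour wanted for hand-optimized routines.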
Posted on 2004-03-31 08:54:49 by f0dder
Hi f0dder,

Pretty cool idea but maybe a bit too much for me. I guess you could frame it in a block, like:

USE EDI,ESI,EBX,EAX
LOCAL SomeVAR :DWORD
; Code here
ENDU

It would attempt to replace variables local to the block with the registers specified and restore them on exit. That would handle scope problems.
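
A rough sketch of what that expansion might look like, modelled in Python. It assumes the block simply binds each LOCAL to the next register in the USE list, saves those registers on entry and restores them on exit. The function name expand_use_block and the sample body are invented for illustration, and the body uses the local without brackets so that a plain register can stand in for it.

```python
import re

def expand_use_block(regs, local_names, body):
    """Bind each LOCAL to one of the USE registers for the duration of the block."""
    if len(local_names) > len(regs):
        raise ValueError("more locals than registers in USE")
    binding = dict(zip(local_names, regs))
    used = list(binding.values())
    out = [f"push {reg}" for reg in used]          # save registers on entry
    for line in body:
        # Swap whole identifiers only; everything else is copied verbatim.
        out.append(re.sub(r"\b\w+\b",
                          lambda t: binding.get(t.group(0), t.group(0)),
                          line))
    out += [f"pop {reg}" for reg in reversed(used)]  # restore in reverse order
    return out

block = expand_use_block(
    ["EDI", "ESI", "EBX", "EAX"],
    ["SomeVAR"],
    ["mov SomeVAR, 5", "add SomeVAR, ecx"],
)
print("\n".join(block))
```

Registers in the USE list that end up with no local bound to them are left alone, so they cost nothing.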

Hi pelaillo,

I knew about FIX but wasn't aware that the alias was redefinable. A quick check of the manual shows me that it is exactly what I was asking for. Looks like someone thought of it first :)
Posted on 2004-03-31 09:22:57 by donkey
Hm yes, it could be done both with "virtual registers" and by aliasing locals to registers. Dunno which would be easiest or most intuitive - it's important that such a feature would only be turned on if you explicitly enable it, though, to avoid confusion.
Posted on 2004-03-31 09:32:22 by f0dder
You can always alias registers by using text macros, at least in MASM. Too bad you can't "undefine" them.

cbBytes TEXTEQU <ECX>

On machines where registers were referenced by number, many programmers would alias them with EQUs.

R0 equ 0
R1 equ 1
cbBytes equ 2
etc.
Posted on 2004-03-31 15:12:03 by tenkey
Hi all,

Interesting ideas, I will give them a try when coding reaches that point.

My first concern is being able to assemble old/existing TASM and MASM source code so we will not lose all this code base already written ;) Honestly, my primary target is TASM in non-ideal mode, but the secondary target is MASM.

The only thing I do not like is jz >; I find it much too cryptic, and the same sign is used for comparisons... I prefer a more explicit notation like jz @@forward or something like that.

I completely dislike cryptic and confusing notations...

About register aliasing: well, it should not be very hard to implement, and FASM's style looks nice (i.e. using the "fix" keyword), and MASM's text equates should also be "undefinable".

I do full ASM coding on big projects and I do not optimize every routine. I only optimize code where I believe it matters... however, I never felt I needed such a feature... keeping track of used/free registers inside a PROC was a pretty easy task for my mind...

Anyway, I will give it a try when I reach that stage.
Posted on 2004-03-31 15:25:42 by BogdanOntanu
Register alias, invoke macro, other cool stuff. Don't you think that in a couple of years moving this way you're gonna invent an absolutely 'new' programming language and call it C.
Posted on 2004-04-01 02:23:43 by Vaxon
I think one small improvement would be... an automatic alias :D, handled by scopes, but that gets a little hard when you use more than one alias for the same register in the same scope. Interesting problem... the assembler would have to read your mind



I now think I was wrong. I could not post earlier because there was a power cut. What I was thinking is that, for example, when you use:

ECX ALIAS:cbBytes
EDI ALIAS:pArray
it is because at some point (in fact, in the next instructions) you will be using these registers as exactly that. As a matter of fact, though, you first need to move the variable into the register, because it would make no sense to do something like:



ECX ALIAS:cbBytes
EDI ALIAS:pArray

mov pArray,offset OnlyANumber
:
mov al,[pArray]
inc pArray
or al,al
jz >
inc cbBytes
jmp <
:

ECX ALIAS:nothing
EDI ALIAS:nothing


Because you are moving the address of a single number, yet aliasing the register as a pointer to an array... Get the idea?

(I will criticize a little :) ... don't mind it too much, it is only a point of view.)



ECX ALIAS:cbBytes
EDI ALIAS:pArray

mov pArray,offset MyArray
:
mov al,[pArray]
inc pArray
or al,al
jz >
inc cbBytes
jmp <
:

ECX ALIAS:nothing
EDI ALIAS:nothing


mmm... how to explain...??



Think about how or when you would change an alias. I think only in move instructions, xchg,
maybe when you are zeroing a register and want to change the alias. But what
I want to say is that the things I call "scopes" are defined automatically by the programmer,
and the assembler does not in fact need to read your mind, because when a programmer writes:

mov eax, pArray

he/she is declaring that in the next instructions eax holds the address of pArray... and ta-da...
an automatic alias for you ;). The scope and the "ecx alias:nothing" line are not necessary,
because a programmer normally first moves in the variable that he/she wants,
and you can use the name of the variable as the alias.

One small point here is how to differentiate between the register and the real variable.
Maybe by adding a single character (say, one outside the standard a-z or A-Z) at the start, indicating that you are using
the alias. Note that only one alias exists at a time for a given register.

Then the code... on the assumption that the moment you want to change the alias of a register
is when you move in a variable, or some alias name different from the one it currently holds.




xor ECX, ecx?cbBytes ; explicit rename or alias for ecx

mov edi,offset MyArray
:
mov al,[#MyArray] ; maybe aliasing al is not necessary here,
; because you know that you are using an alias for the mov.
inc #MyArray
or al,al
jz >
inc (#)cbBytes ; I put (#) because it is not necessary to make
; a distinction (by assumption) between an address and the alias of the register.
jmp <
:


; no need for alias:nothing, because in the next procedure or instructions,
; when you want to handle eax under another alias, you will normally
; move the variable into... the register that you want to alias



The important thing here is the principal assumption behind the automatic alias: in general, you only want to treat a register in a different way when you move in a variable different from the one it held previously.
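
Here is a small sketch of that rule, modelled in Python rather than in any real assembler. It assumes just two ways of naming a register: the implicit one (mov reg, variable or mov reg, offset variable) and the explicit register?alias form, with #name resolving to whichever register currently carries that name. The helper names bind and auto_alias are made up for the example, an unknown #name simply raises KeyError, and the labels and jumps are left out of the sample input.

```python
import re

REGS = {"eax", "ebx", "ecx", "edx", "esi", "edi"}
MOV = re.compile(r"\s*mov\s+(\w+)\s*,\s*(?:offset\s+)?([A-Za-z_]\w*)\s*$",
                 re.IGNORECASE)
EXPLICIT = re.compile(r"\b(\w+)\?(\w+)")

def bind(bound, name, reg):
    """Give `reg` the alias `name`; a register carries one alias at a time."""
    for old in [a for a, r in bound.items() if r == reg]:
        del bound[old]
    bound[name] = reg

def auto_alias(lines):
    bound = {}  # alias name -> register
    out = []
    for line in lines:
        code = line.split(";", 1)[0].rstrip()  # ignore comments
        # Explicit form: "ecx?cbBytes" names ecx and is emitted as plain ecx.
        for reg, name in EXPLICIT.findall(code):
            bind(bound, name, reg.lower())
        code = EXPLICIT.sub(lambda m: m.group(1), code)
        # "#name" resolves to the register holding that alias right now.
        code = re.sub(r"#(\w+)", lambda m: bound[m.group(1)], code)
        out.append(code)
        # Implicit form: "mov reg, variable" names reg after the variable.
        m = MOV.match(code)
        if m and m.group(1).lower() in REGS and m.group(2).lower() not in REGS:
            bind(bound, m.group(2), m.group(1).lower())
    return out

lines = [
    "xor ecx, ecx?cbBytes ; explicit alias",
    "mov edi, offset MyArray",
    "mov al, [#MyArray]",
    "inc #MyArray",
    "inc #cbBytes",
]
resolved = auto_alias(lines)
print("\n".join(resolved))
```

No alias:nothing is needed in this model, because the next mov of a different variable into the same register replaces the old name.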


Get the idea? What do you think? Surely, depending on how you implement it, it will be easier or harder... ;)

Have a nice day or night.
Posted on 2004-04-01 08:03:56 by rea

I was thinking about code readability today and thought that one of the biggest obstacles to reading assembly is figuring out what's in the registers at a given point. It would be very cool if you could alias the registers in the same way you assume them. For example if you had a proc that counted bytes you could do this:

ECX ALIAS:cbBytes

EDI ALIAS:pArray

Just a thought, feel free to point out how stupid the idea is :)


If Bogdan's assembler is true to its TASM roots, you should be able to write:



pArray textequ <byte ptr [edi]>


I use this trick all the time in HLA using text constants, e.g.,



const
attrEBX: text := "(type attr [ebx])";
.
.
.
mov( attrEBX.someAttrField, eax );


The only comment I have, is that it's real easy to lose track of which registers you're using in a procedure if you don't include the register's name as part of the alias (which is why I tend to use names like "attrEBX" in the aliases I create).
Cheers,
Randy Hyde
Posted on 2004-04-13 13:56:27 by rhyde
Originally posted by f0dder (I know this will probably sound like heresy) - but what about building a register allocator into the assembler? This would be an advantage for those people who do full-assembly coding and don't care to optimize every single routine. I'm not sure how exactly this should be done, perhaps by creating a range of "pseudoregisters", like 16 of them, and have the assembler do register spilling etc.

The GCC in-line assembler actually does this sort of thing already.
I've heard of other assembler projects planning on providing register allocation, but I've not seen anything real yet. I'd added optimization and register allocation to the HLA roadmap (around v3.0), but at the rate things are going with HLA v2.0, I wouldn't hold my breath waiting for that feature :-).


Of course these regs should have different names, and using the standard eax etc should do verbatim assembling of instructions, as is needed when optimizing things by hand.

I think the general consensus among people who were having this discussion a couple of years ago is that you would use something like @eax, @ebx, @ecx, etc., to give the assembler a hint about what registers it *ought* to use, and it could substitute a different register if it was more convenient. A special @reg syntax was chosen when the exact register didn't matter at all (or if you needed more than 6-7 registers in a given code sequence and you wanted the compiler to keep things straight for you).



For those doing full-asm coding, I don't think the idea of an "optimizing" assembler is bad, as long as it only "optimizes" when you tell it to.

or, conversely, stops optimizing when you tell it not to.
For the average person, the default should probably be "optimization on", as most people don't really count the cycles on each and every instruction they write. Of course, you do need the ability to turn optimization on or off on a line by line basis, or the "purists" scream bloody murder (and for those who don't want an optimizer touching their code at all, you simply include a pragma at the beginning of each source file that disables optimization from that point forward).


Not like it will matter much to me personally, since I usually only do subroutines in assembly, but it might be worth considering.

Of course, local optimizations are probably the only kind you'll find in a typical assembler/optimizer. So in that sense it would be of interest to you.
Cheers,
Randy Hyde
Posted on 2004-04-13 14:04:37 by rhyde

Register alias, invoke macro, other cool stuff. Don't you think that in a couple of years moving this way you're gonna invent an absolutely 'new' programming language and call it C.


Couple of years? Try early 1990's. MASM (and a little later, TASM) has had most of this stuff since then. Certainly HLA has had this stuff since 1999. And, no, it's not called C.
Cheers,
Randy Hyde
Posted on 2004-04-13 14:07:18 by rhyde
The only comment I have, is that it's real easy to lose track of which registers you're using in a procedure if you don't include the register's name as part of the alias (which is why I tend to use names like "attrEBX" in the aliases I create).



But I think that would defeat a little of the purpose of using an alias.

I think that if an assembler really can use aliases, and if the 'automatic alias' is usable, then whenever you are in doubt about what an alias such as #cbBytes refers to (on a first read of the code), you can ask the assembler, and the assembler will necessarily answer you ;) (in its output ;) ), for example:

who is #cbBytes or
?#cbBytes

and the assembler says: ecx is #cbBytes

Also, probably, and especially while the assembler is being built, in order to test the rules applied for the 'automatic alias' (to check that they are right), the assembler could give you a list of which aliases were used for each register and on which lines.

Also note that I said an alias is normally used only when you treat (or handle) a register in some particular way, and you normally only want to handle it differently because you are moving another value into the register; in that case, the alias is taken from the name of the variable. There is also the suggestion that, when there is no move, a more compact explicit alias exists, something like register?alias.




I suppose it will be necessary to analyze some code, from short pieces to large ones. At least with the (short) ones I tested in my head using "simple" rules, the assembler should be able to handle them. By the way, the rules are the ones I marked in blue in the previous post ;) plus the explicit alias.


Another thing to note is that these rules (which can be extended) apply well to a single alias (only one register has a given alias in a particular part of the code), but what happens with:

mov eax, pArray ; eax can be referenced as #pArray
mov edx, pArray ; what happens with this code!!!... a collision?
inc #pArray ; which of the two?? Supposing I accept the same name for different registers (this case is similar to mov edx, #pArray)

A possible answer is the one Randall gave: put a prefix that identifies the register.

Also, to avoid writing the whole register name, you could use short forms such as eax=a, ecx=c, edx=d, ebx=b, esi=s, edi=d and others... (maybe).

And you can reference it with the plain #pArray only when one of the two (or more) registers gets referenced by a single different alias; the one that has not lost its previous state will be the one with the alias #pArray.
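
A rough Python model of the listing and the collision check, only to show that both fall out of the same bookkeeping. It assumes an alias is created solely by mov reg, variable and that a #name is ambiguous whenever more than one register carries that name at the point of use. The function name alias_report and the shape of its result are invented for the example.

```python
import re

REGS = {"eax", "ebx", "ecx", "edx", "esi", "edi"}
MOV = re.compile(r"\s*mov\s+(\w+)\s*,\s*(?:offset\s+)?([A-Za-z_]\w*)\s*$",
                 re.IGNORECASE)

def alias_report(lines):
    holders = {}   # alias name -> set of registers currently carrying it
    history = {}   # register -> list of (line number, alias) in source order
    problems = []  # (line number, alias, registers) for every ambiguous #name
    for number, line in enumerate(lines, start=1):
        code = line.split(";", 1)[0].rstrip()  # ignore comments
        for name in re.findall(r"#(\w+)", code):
            regs = holders.get(name, set())
            if len(regs) > 1:
                problems.append((number, name, sorted(regs)))
        m = MOV.match(code)
        if m and m.group(1).lower() in REGS and m.group(2).lower() not in REGS:
            reg, name = m.group(1).lower(), m.group(2)
            for regs in holders.values():
                regs.discard(reg)  # the register loses its previous alias
            holders.setdefault(name, set()).add(reg)
            history.setdefault(reg, []).append((number, name))
    return history, problems

lines = [
    "mov eax, pArray",
    "mov edx, pArray",
    "inc #pArray",
]
history, problems = alias_report(lines)
print(history)
print(problems)
```

The history part is the per-register listing, and the problems part is where the assembler could either stop with an error or ask for a register-specific name instead.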


And push and pop are interesting cases to analyze... handle them or not... that is, should they change the alias or not?

Have a nice day or night.
Posted on 2004-04-14 09:24:27 by rea