My assembler, GoAsm, is currently at beta stage, but I want to add SSE and SSE2 instructions to it soon, and also to my debugger GoBug. I wondered whether a consensus has yet been reached amongst assembler programmers as to the best syntax to use with these instructions.
In particular, when declaring data in a 128-bit hex chunk, is it better to declare DO (meaning "OctoWord", which I believe is favoured by Microsoft and Rene's Spasm) or DX (which I believe is favoured by the authors of NASM) or DQWORD (in FASM) or maybe DS (sixteen bytes)?
Also, what is the best syntax for writing the branch hint in SSE2? This adds a prefix byte 2Eh or 3Eh to the jxx opcodes to tell the processor whether or not to predict that the conditional jump will be taken. Possibilities are pnt (prediction not taken) or ptk (prediction taken), or bnt or btk (branch not taken or branch taken), or maybe hintnt or hinttk. Or maybe a modern assembler for the Pentium 4 should automatically add a 2Eh (branch not taken) for all forward jumps and a 3Eh (branch taken) for all backward jumps (usually the most likely result), which could be overridden using the hint codes.
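
The automatic policy floated above can be sketched in a few lines. This is only an illustration, written in Python to stay neutral between assembler syntaxes: the function name `auto_hint` and the override strings are hypothetical and are not taken from GoAsm or any other assembler.

```python
# Sketch only: names and the override strings are hypothetical.
HINT_NOT_TAKEN = 0x2E  # branch hint prefix: predict not taken
HINT_TAKEN = 0x3E      # branch hint prefix: predict taken


def auto_hint(jump_addr, target_addr, override=None):
    """Pick a hint prefix byte for a conditional jump.

    override may be "taken" or "not_taken" (an explicit programmer hint);
    None means fall back to the direction-based choice.
    """
    if override == "taken":
        return HINT_TAKEN
    if override == "not_taken":
        return HINT_NOT_TAKEN
    if override is not None:
        raise ValueError("unknown hint: %r" % (override,))
    # Backward jumps (loops) get "taken", forward jumps get "not taken".
    return HINT_TAKEN if target_addr <= jump_addr else HINT_NOT_TAKEN
```

For example, `auto_hint(0x100, 0x180)` picks 2Eh for a forward jump, while `auto_hint(0x100, 0x180, "taken")` lets the programmer force 3Eh.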
Grateful for your views.
Posted on 2002-02-04 05:59:20 by jorgon
I use m128 to define data.
And I'm very glad to see you here, Jeremy.
Posted on 2002-02-04 06:58:07 by The Svin
I must have explained it badly, Jeremy. I provided 'O' (OctoWord) for compatibility but, like the NASM developers, I implemented 'X' too, and I prefer and recommend it (much more readable, it fits better with XMM, MMX, ..., and it reads like 'AnyThing', because the real sizes of the targeted data are not real OctoWords).

It would be great if we could agree on the 'Colon Dilemma'... but Wayne and I have been unable to agree on this point. Difficult. I just saw that, in NASM syntax, this point is far from clear and simple too.

Also, to me, a leading point for Local-talking-Labels would be better (of course this is bound to the colon problem).

Posted on 2002-02-04 07:54:36 by Betov

Or maybe a modern assembler for the Pentium 4 should automatically add a 2Eh.

Better make this an option. And have manual prefixes anyway.
I prefer bnt/btk, they have the best ring to them.
Posted on 2002-02-04 08:55:44 by f0dder
... And what do these Prefixes look like with MASM ?

Jeremy, did you ask Frank Kotler about these? (I think they must be implemented in NASM, but I have not seen it).

Posted on 2002-02-04 09:40:49 by Betov
I would say DO, because it follows the convention: db, dw, dd, dq. I wouldn't make it DX. Everyone seems to have an unexplainable fetish for the letter X. It doesn't make sense to use X considering that it doesn't relate to 128 bits or octoword.
Posted on 2002-02-04 10:28:32 by Hel

I would say DO, because it follows the convention: db, dw, dd, dq. I wouldn't make it DX. Everyone seems to have an unexplainable fetish for the letter X. It doesn't make sense to use X considering that it doesn't relate to 128 bits or octoword.
I agree. We should leave X for the next data type. :tongue:
Posted on 2002-02-04 10:37:41 by bitRAKE

I would be inclined to stick with the Microsoft/Intel notation for 128 bit data, the DO makes sense in compatibility terms.

The prefixes should be able to be added by programmer choice so you have the form,

prefix instruction



Makes it programmer adjustable without any problems.

Glad you could make it here.

Posted on 2002-02-05 06:08:49 by hutch--
Hutch, and others, again, what is the form of these Prefixes + Instructions (or PrefixInstructions) with MASM?

Or, as no one answers, am I to understand that they are not implemented in MASM?

Or as no one ever used them, nobody knows?

Posted on 2002-02-05 08:21:20 by Betov
Well thanks for all your answers which I read with interest.
And thanks for the welcome too - it's very nice to see all these friendly faces here!
Posted on 2002-02-06 16:28:05 by jorgon
In a few days from now I'll be publishing GoAsm with support for the SSE/SSE2 instructions.
This will be a beta version for testing for a short time available from the GoAsm forum
followed by a full version available free from the Go Tools website.

The question of what assembler syntax would be best to generate the branch hint bytes 2Eh and 3Eh generated some debate here and elsewhere when this thread was opened. I also had some e-mail correspondence about it with assembler authors and users.

The Intel documentation states that these branch hint bytes "can be used only at the machine code level (that is, there are no mnemonics for the branch hints)". In reality this means that to add these bytes to assembler code you would need to use, for example
DB 2Eh  ;or

DB 3Eh
or alternatively use a macro to achieve the same result.
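
At the byte level, the `DB` approach simply places one extra byte in front of the jump. The following is a minimal sketch of that encoding for a short conditional jump, assuming the standard short Jcc opcodes 70h..7Fh with an 8-bit displacement; the function name `short_jcc` is made up for illustration, and `addr` is taken to be the address of the first emitted byte (the prefix, if present).

```python
def short_jcc(opcode, addr, target, hint=None):
    """Encode a short conditional jump placed at addr.

    hint is None, 0x2E (not taken) or 0x3E (taken); when given, it is
    emitted as one byte in front of the jump, as DB 2Eh / DB 3Eh would do.
    """
    if not 0x70 <= opcode <= 0x7F:
        raise ValueError("not a short Jcc opcode")
    if hint not in (None, 0x2E, 0x3E):
        raise ValueError("not a branch hint byte")
    prefix = b"" if hint is None else bytes([hint])
    # The displacement is relative to the address of the next instruction,
    # so the extra prefix byte changes it by one.
    disp = target - (addr + len(prefix) + 2)
    if not -128 <= disp <= 127:
        raise ValueError("target out of range for a short jump")
    return prefix + bytes([opcode, disp & 0xFF])
```

For instance, `short_jcc(0x75, 0x10, 0x00, hint=0x3E)` encodes a hinted `jnz` back to address 0 as the three bytes 3Eh 75h EDh.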

The Intel IA-64 documentation however, suggests that the assembler instruction (for the more complex branch instructions for that processor) might be
where brp stands for "branch predict instruction", and ipwh and ih vary according to the instruction required.

There is also a "C" code example:-
So I am convinced that we will be using branch hinters in the future, and that it would be useful to have an easy way to use the limited branch hinters available on the P4.
No consensus was actually reached in our debate on this last year and I now need to make a decision for GoAsm.

The P4 uses the branch hint bytes to vary the usual prediction made by the processor as to whether or not a conditional jump will take place. The processor uses this prediction so that execution can continue from the destination of the jump without delay.

The default prediction is that:-
All forward conditional jumps will not take place, and
All backwards conditional jumps (loops back) will take place.

It is documented that in normal code, backward conditional jumps tend to take place 80% of the time, whereas forward conditional jumps will be likely not to take place. It is said that overall, the default prediction is correct 65% of the time.
So predicting whether or not the conditional jump will take place can speed up the code particularly on a series of loops back.

As an assembler programmer, you can produce fast code knowing the default action since you are in complete control over how you code your loops. But suppose (unusually) you want to create a loop where the conditional jump is at the beginning of a code fragment and the destination of the jump is forward in the code. This loop will run more slowly because of the default predictions. In this case, however, you can add the branch hint byte 3Eh as a prefix to the conditional jump instruction to tell the processor to assume that the jump will occur most of the time.

Using the branch hint byte 2Eh will be useful where you want to use a backwards conditional jump for example in case of an error exit from a procedure. Using 2Eh as a prefix to the conditional jump will stop the processor from assuming that the backwards jump will occur most of the time.
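
The two cases above amount to one rule: a hint byte is only worth adding when the programmer's expectation contradicts the default prediction. A small sketch of that rule, with hypothetical function names and assuming the default described in this post (backward taken, forward not taken):

```python
def default_prediction(jump_addr, target_addr):
    """Default described above: True (taken) for backward jumps, False for forward."""
    return target_addr <= jump_addr


def hint_if_needed(jump_addr, target_addr, expect_taken):
    """Return 0x3E or 0x2E only when the expectation contradicts the default.

    Returns None when the default prediction already matches, so no prefix
    byte needs to be emitted.
    """
    if expect_taken == default_prediction(jump_addr, target_addr):
        return None
    return 0x3E if expect_taken else 0x2E
```

A forward jump that is expected to be taken gets 3Eh, a backward error exit that is expected not to be taken gets 2Eh, and an ordinary loop back gets nothing.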

Note that 2Eh and 3Eh used to be used as CS and DS segment override bytes in 16 bit programming, but in 32-bit programming they have this new meaning on later processors.

Since Intel does not recommend any particular mnemonic to insert the branch hint bytes, it is up to assembler and compiler writers to decide what syntax to use.

Various ideas have been put forward by my correspondents. For the sake of
completeness I list them here with my comments:-

LTJ - "likely taken jump"
UTJ - "unlikely taken jump"
OFT - "often taken"
NOFT - "not often taken"
BTK - "branch taken"
BNT - "branch not taken"
PRED_T - "predict taken"
PRED_NT - "predict not taken"
HINT_T - "hint taken"
HINT_NT - "hint not taken"

I think the problem with these is that they may conflict with future mnemonics which may be introduced, or existing labels.
Also they are meaningless. Someone reading your code would need to look in the assembler manual to see what the instruction means.


To my mind these are better because they are more descriptive but I still think they do not actually describe the instruction properly.

After several sleepless nights over this matter I have come up with:-


I believe this is clear because it can be seen that the instruction is to the processor to assume that the conditional jump will branch or will not branch. Also, in anything but 16-bit programming, assume has (hopefully) been abolished and is now available for re-use. Even in code which does use assume, the above instruction is unique and meaningful. And the instruction is clearly not a code label (not in GoAsm anyway) since it has no trailing colon. There is some extra typing involved, but in
practice this instruction will be rarely used, so this is not too onerous. If desired the instruction could be replaced by a shorter macro.

There is still time to persuade me to change my mind about this syntax before the final version of GoAsm supporting SSE and SSE2 is published.
Posted on 2003-09-05 17:09:52 by jorgon
Right... but the meaning of 2Eh and 3Eh has nothing to do with whether the code is running as 32-bit or 16-bit. It depends on whether or not the following instruction is a conditional jump. They still work as expected (as segment overrides) with memory-referencing instructions.
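
As a rough sketch of this point, a decoder can only decide what a leading 2Eh or 3Eh means by looking at the opcode that follows. The helper names below are hypothetical, and the sketch assumes the prefix is immediately followed by the opcode (no other prefixes in between) and treats only the short (70h..7Fh) and near (0Fh 80h..8Fh) forms as conditional jumps.

```python
def is_jcc(code, i):
    """True if a Jcc opcode starts at code[i] (short 70h..7Fh or near 0Fh 80h..8Fh)."""
    if i < len(code) and 0x70 <= code[i] <= 0x7F:
        return True
    return i + 1 < len(code) and code[i] == 0x0F and 0x80 <= code[i + 1] <= 0x8F


def prefix_meaning(code):
    """Describe a leading 2Eh/3Eh byte according to the instruction after it."""
    if not code or code[0] not in (0x2E, 0x3E):
        return "no 2Eh/3Eh prefix"
    if is_jcc(code, 1):
        return "branch not taken hint" if code[0] == 0x2E else "branch taken hint"
    return "CS segment override" if code[0] == 0x2E else "DS segment override"
```

So the bytes 3Eh 75h EDh read as a hinted `jnz`, while 2Eh 8Bh 03h is still a `mov` with a CS override.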
Posted on 2003-09-05 17:37:00 by Sephiroth3
LTJ - "likely taken jump"
UTJ - "unlikely taken jump"

Sounds like it best represents the programmer's thinking process to me. Since Intel has left the name undefined, how about using a short form and future-proofing with a long form?
Posted on 2003-09-06 00:52:51 by ThoughtCriminal
I think the following sounds the most technically appropriate.

HINT_T - "hint taken"
HINT_NT - "hint not taken"
Posted on 2003-09-06 05:33:29 by roticv
I like a hybrid of jorgon's & roticv's:

Posted on 2003-09-06 07:40:23 by Eóin
Are these "hint" mechanisms present on Athlon CPUs?
Posted on 2003-09-06 08:47:38 by gfalen
Thanks all, for your comments.
I may well take up your proposed syntax, Eóin: I'll see how it looks today.
Posted on 2003-09-10 02:25:10 by jorgon
The GoAsm beta with SSE/SSE2 support is now available directly from here
(217K download including the latest help file). Or visit the Go tool website
Posted on 2003-09-11 17:30:40 by jorgon