Hi all

I was trying to create a tool for optimizing ASCII to floating point conversion, but running the same code located at different places does not produce the same results. Does anyone have a solution? Here is my test project.

KetilO
Posted on 2002-08-12 05:10:20 by KetilO
Again, read what I wrote in the PROFILE source I released yesterday. All your questions are answered there. All docs are written with a certain purpose in mind.

I don't use MASM but if you really want to (??!), write an ALIGN MACRO exploiting $-START (to know how many bytes to fill with NOPs or DB 0s) where you're sure that START is 4096 or 65536 bytes aligned (HINSTANCE anyone?).

You'll find WHY you need to do the above in the comments of the PROFILE source that I also released yesterday. The comments are there for a reason, believe me, not just to waste bandwidth and development time.
Posted on 2002-08-12 07:34:40 by Maverick
Well, Nexo invented a way to copy the code to the same address and execute it there. I thought that should take care of it.

KetilO
Posted on 2002-08-12 07:49:28 by KetilO
To work as expected, PROFILE's requirements must be followed 100.0%, because it's a tricky routine (for a tricky/difficult task).

I don't know the details of Nexo's technique, so I cannot say anything about that, besides what I just wrote above.

I just know that PROFILE never gave me a single problem, and I'm used to stressing the hell out of my tools.

But (especially for such tricky routines/techniques) the docs have to be read and applied very carefully. I'm not in the habit of documenting the obvious.
Posted on 2002-08-12 09:02:08 by Maverick
Well, too bad that I cannot make it stable with masm. But I am not going to switch to another assembler because of it.

KetilO
Posted on 2002-08-12 09:17:28 by KetilO
Originally posted by KetilO
but running the same code located at different places does not produce the same results

In the test project you posted above, it produces the same results on my computer.

bye

eko
Posted on 2002-08-12 09:58:25 by eko

Originally posted by KetilO
Well, too bad that I cannot make it stable with masm. But I am not going to switch to another assembler because of it.

KetilO

I fully respect your choice, of course.

Please just don't give PROFILE a bad name because of MASM's limitations. That would be unfair.
Posted on 2002-08-12 13:20:39 by Maverick
Hi Maverick

On my home computer (P III Win98 SE) it works as it should. On my computer at work (processor unknown Win XP) it gives unstable results. I don't think masm is the reason.

KetilO
Posted on 2002-08-12 14:53:34 by KetilO

Originally posted by KetilO
I was trying to create a tool for optimizing ASCII to floating point conversion, but running the same code located at different places does not produce the same results. Does anyone have a solution? Here is my test project.

When I run a test, each piece of code under test is placed at the same location. This creates equal conditions for execution. Otherwise, when you modify one of the test routines, the other test routines shift in the address space: you optimize one routine and the clock counts change for another. That is wrong. It shows up clearly in testa2dw. In one case my algo was very slow and in the other very fast. The results of this test may look fake :) The only difference was the code location. PROFILE does not solve this problem.
Therefore I sometimes place code at a specific location for best performance, adjusting its position with respect to cache lines and branches. The result of this trick can be observed. The values of the pointers passed as arguments to the proc are also very important.
The test code is one thing. The worker code is another. Nonsense :grin:
It is a very interesting area of code optimization.
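
To illustrate the point about code location, here is a small sketch in Python. It is not part of the test project: the function name `cache_lines_touched`, the 40-byte routine size and the 64-byte line size are assumptions chosen for illustration. It counts how many cache lines a routine occupies depending on where it starts, which is one way identical code can behave differently after being shifted.

```python
CACHE_LINE = 64  # assumed line size in bytes; real CPUs may differ

def cache_lines_touched(start, length, line=CACHE_LINE):
    """Number of cache lines covered by `length` bytes starting at `start`."""
    first = start // line
    last = (start + length - 1) // line
    return last - first + 1

# The same hypothetical 40-byte routine at two start addresses:
aligned = cache_lines_touched(0x1000, 40)  # starts on a line boundary
shifted = cache_lines_touched(0x1030, 40)  # pushed 48 bytes further along
print(aligned, shifted)
```

The routine that starts on a line boundary fits in one line, while the shifted copy straddles two, even though not a single instruction changed.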
Posted on 2002-08-12 14:56:01 by Nexo

Originally posted by KetilO
Hi Maverick

On my home computer (P III Win98 SE) it works as it should. On my computer at work (processor unknown Win XP) it gives unstable results. I don't think masm is the reason.

KetilO


Again KetilO,

Perhaps if you first followed all the rules and requirements you could then say it for real, also for that specific CPU.

The "fscking manual" that you refuse to read states that all routines to be tested, including the empty RET, must be at offset 0 of a 64-byte alignment. If you don't think there's any reason for it and don't know how tricky cacheline issues are, just fscking apply the requested rule anyway. Thank you, perhaps now it's finally clear.

Please stop (ab)using my tools if you can't even read the docs after I kindly asked you to do so 5 times.

I'll stop this nonsense now though, I'm annoyed, and I don't have time to waste repeating once more what's clearly written in the docs. Computer Science is an exact science, it's not a "what's that ALIGN word? whatever.. who cares???". Hope you finally get it, or stop using PROFILE, even better.

From now on I state that PROFILE works only in FASM and NASM. Do not use it with any other assembler, or if you modify any single byte or do not provide all (none excluded) of the required alignments. I don't think I'm being any clearer than before, but anyway..

Then, if you followed ALL (why do I have to make it bold???) the required rules and it still doesn't perform as it should, report it and I will see what's wrong. But I cannot fix something that doesn't even follow the specified, required rules (even if you don't understand how tricky and important they are).. because I already know what's broken.. a wrong implementation. No need to bother more, then.

Clear?
Posted on 2002-08-13 02:17:28 by Maverick
Guys, peace please :tongue:

After the PROFILE release and the MASM port that followed, it was clearly stated that MASM had problems with align... some fixes have been found I think, but they don't seem to be enough to make PROFILE work with MASM as it should and as it was designed...
I used the tool a bit with MASM and always had stable results (maybe once or twice a value was a bit higher or lower than the others, but...). I haven't had the opportunity to test it since I started using FASM, but as it was designed for NASM/FASM it should work better...

Now, I'm wondering if it would be possible to assemble the PROFILE code into an obj and link it with MASM or VC code, for example...
Ketil has a bunch of MASM code and for him the full switch to another assembler is not possible...
FASM is light enough to always be on the disk. If you need to test a procedure, copying and pasting the code and assembling it using FASM with the profile code seems a handy solution, even if the final code will be assembled with MASM...

Would you mind assembling the piece of code you want to test in FASM and running it on your PC at work to see if the results are still unstable? Then we would be able to see if MASM is really not the problem (which I don't think).
As Maverick said, computer science is an exact science, and for this reason, I think the only thing he will accept and believe as facts is output given by the tool assembled with the proper assemblers...
Posted on 2002-08-13 07:11:28 by JCP



"fscking manual"



You use too much Linux :)
Posted on 2002-08-13 07:21:17 by bazik
Hi Readiosys, you wrote: Now, I'm wondering if it would be possible to assemble the PROFILE code into an obj and link it with MASM or VC code, for example...

Not if the linker doesn't provide the required alignment.

As an alternative, VirtualAlloc one page and copy the PROFILEr code there (respecting the 64-byte alignment of .empty, and ALL the rest, exactly as provided), and just to be sure VirtualAlloc another page and place the data (PROFILEr's variables) there. Then VirtualAlloc a page for each routine to test. VirtualAlloc ensures 4 KB alignment, which fulfills at least the alignment requirements.
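
As a rough illustration of why page-granular allocation is enough, here is a sketch in Python. It does not call VirtualAlloc: Python's `mmap` module is used as a portable stand-in for an anonymous page-aligned allocation, and the helper name `alloc_page_aligned` and the region names are assumptions made for this example. It only checks the alignment of the returned addresses and does not copy or execute any code.

```python
import ctypes
import mmap

def alloc_page_aligned(size=mmap.PAGESIZE):
    """Return (buffer, address) for an anonymous, page-aligned mapping."""
    buf = mmap.mmap(-1, size)              # anonymous read/write mapping
    view = ctypes.c_char.from_buffer(buf)  # temporary view to read the address
    addr = ctypes.addressof(view)
    del view                               # release the view; buf stays valid
    return buf, addr

# One region each for the profiler code, its data, and a routine under test.
regions = {name: alloc_page_aligned() for name in ("profiler", "data", "routine")}
for name, (buf, addr) in regions.items():
    print(name, hex(addr), addr % mmap.PAGESIZE == 0, addr % 64 == 0)
```

Because the page size is a multiple of 64, anything placed at offset 0 of such a region also starts at offset 0 of a 64-byte line.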

MASM though sometimes doesn't even assemble 1:1 the opcodes.. why bother with Bill's tools? The very little free time I have for the board, I'd rather spend it to support FASM than MASM.

Ketil has a bunch of MASM code and for him the full switch to another assembler is not possible...

I never blamed him, he's free to do what he wants.. as long as he doesn't ignore what I repeated 5+ times, documented, etc.. not without a reason.

I gave you all my PROFILEr because I noticed it was a commonly requested tool. I could have made various specific-CPU versions, or only provided the one that works with my own CPU, but I thought it was a better support to design an as much as possible CPU-independent routine, so that it could work on everybody's development system, and not give you any problem.
If PROFILE is broken for somebody or for some particular CPU I'm glad to fix it and give all of my support. I just ask for the small courtesy of really reading the docs and applying them before complaining. Otherwise it's only a waste of time for me and for you all, precious time that could be dedicated to something else. Hope I'm not asking too much.. I want to help but one has to read and apply the docs first. Writing docs is not a hobby for me, exactly like it is not a hobby for any of you.

Would you mind assembling the piece of code you want to test in FASM and running it on your PC at work to see if the results are still unstable?

He has to provide an ALIGN MACRO. If he doesn't have a working one, PROFILE won't run ideally.

Here's my FASM ALIGN MACRO:



; Offset must be > -Alignment and < Alignment. You can omit the Offset parameter (i.e. = 0).
MACRO ALIGN Alignment,Offset {
IF <Offset> EQ <>
REPEAT (Alignment-1)-(($+Alignment-1) mod Alignment)
NOP ; DB 0 if you prefer
END REPEAT
ELSE
REPEAT (Alignment-1)-(($+Alignment-Offset-1) mod Alignment)
NOP ; DB 0 if you prefer
END REPEAT
END IF
}
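
For anyone porting the macro to another assembler, the padding arithmetic can be checked outside the assembler. The following Python sketch mirrors the two REPEAT counts above; the function name `align_padding` is an assumption for illustration, and `position` plays the role of `$`.

```python
def align_padding(position, alignment, offset=0):
    """Bytes of filler the macro emits so that
    (position + padding) % alignment == offset % alignment."""
    return (alignment - 1) - ((position + alignment - offset - 1) % alignment)

# A few sample positions against a 64-byte alignment.
for pos in (0, 1, 63, 64, 65):
    pad = align_padding(pos, 64)
    print(pos, pad, (pos + pad) % 64)

# With an Offset of 4 the next address lands 4 bytes past a boundary.
print(align_padding(0, 64, 4), align_padding(5, 64, 4))
```

With Offset omitted (0), the first expression of the macro is recovered, so one function covers both branches.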


Posted on 2002-08-13 10:09:39 by Maverick
Originally posted by Maverick
As an alternative, VirtualAlloc one page and copy the PROFILEr code there (respecting the 64-byte alignment of .empty, and ALL the rest, exactly as provided), and just to be sure VirtualAlloc another page and place the data (PROFILEr's variables) there. Then VirtualAlloc a page for each routine to test. VirtualAlloc ensures 4 KB alignment, which fulfills at least the alignment requirements.

Maybe it would be easier to place every test routine (and PROFILE?) in its own segment? Every segment starts with 4 KB alignment.
Posted on 2002-08-13 10:22:40 by Nexo
Originally posted by Maverick
Posted on 2002-08-13 10:36:32 by bazik
bazik :grin:

Nexo: I thought about it.. and I decided that I'll spend the next free time slice I get to make a version of PROFILE that should make everybody happy. Should be here in a couple of days max.
Posted on 2002-08-13 14:51:21 by Maverick
To use larger numbers in the MASM ALIGN directive, full SEGMENT directives must be used; the simplified segment directives (.code/.data/.data?) cannot be used. The support is in MASM, but very few people use it and it doesn't appear to be the standard method around here. I don't use .MODEL either.

Begin programming with this:
.586
.MMX
.K3D
.XMM

OPTION CASEMAP:NONE ; case sensitivity
OPTION LANGUAGE:STDCALL ; needed for all the API PROTOs
OPTION DOTNAME ; allow names to start with '.'


; Set Default Segment order and options:

_TEXT SEGMENT READONLY PAGE PUBLIC USE32 'CODE'
_TEXT ENDS

CONST SEGMENT READONLY PUBLIC USE32 'CONST'
CONST ENDS

_DATA SEGMENT PAGE PUBLIC USE32 'DATA'
_DATA ENDS

_BSS SEGMENT PAGE PUBLIC USE32 'BSS'
_BSS ENDS

ASSUME CS:FLAT, DS:FLAT, SS:FLAT, ES:FLAT
...then you can do:
_TEXT SEGMENT

ALIGN 128
_TEXT ENDS
:grin:

Do not use: .DATA .CODE .DATA? .CONST
Posted on 2002-08-13 23:58:03 by bitRAKE
BitRAKE, could you explain USE32 real quick?

I've tried using it, but only get errors. FLAT works fine however.

Thanks.
Posted on 2002-08-14 02:06:30 by ThoughtCriminal
Hi bitRAKE

Finally some mature response.
I tried what you suggested but it made no difference (P IV).
It shows ~1400 / ~200. No way I am going to believe that some simple adjustments of the code can produce such results. On the P III the results also differ. Maybe I am doing something wrong, so please anyone, have a look at the code.

To all:
I started this thread hoping for a mature discussion leading to a result, but as often happens on this board someone has to show off and use 'war types' to heat up the discussion, thereby destroying the possibility of a result. It's a pity. I am here to learn and share my ideas.

KetilO
Posted on 2002-08-14 04:09:04 by KetilO

Originally posted by ThoughtCriminal
BitRAKE, could you explain USE32 real quick?

I've tried using it, but only get errors. FLAT works fine however.

Thanks.
USE32 has to do with the code generation by the assembler. Instructions can be assembled in two ways for the x86 due to legacy support for 16-bit code. As far as we are concerned, in Windows we use 32-bit code. USE32 tells the assembler that the segment is 32-bit, so 32-bit operands and addresses are the default and need no size prefix. Please post your code that did not work.
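
To make the "two ways" concrete, here is a small Python sketch of how `mov reg, imm` (opcode B8+r) is encoded depending on the segment's default size. The function name `encode_mov_imm` and its parameters are assumptions for illustration; only the 66h operand-size prefix and the B8+r opcode come from the x86 instruction encoding.

```python
OPERAND_SIZE_PREFIX = 0x66  # toggles between 16- and 32-bit operand size

def encode_mov_imm(reg_index, value, operand_bits, segment_bits):
    """Encode `mov r16/r32, imm` for a 16-bit (USE16) or 32-bit (USE32) segment."""
    code = bytearray()
    if operand_bits != segment_bits:
        code.append(OPERAND_SIZE_PREFIX)  # operand is not the segment default
    code.append(0xB8 + reg_index)
    code += value.to_bytes(operand_bits // 8, "little")
    return bytes(code)

EAX = 0  # register index 0 selects AX or EAX
print(encode_mov_imm(EAX, 1, 32, 32).hex(" "))  # mov eax, 1 in a USE32 segment
print(encode_mov_imm(EAX, 1, 32, 16).hex(" "))  # mov eax, 1 in a USE16 segment
```

The same instruction gains a 66h prefix when assembled into a segment whose default size does not match the operand, which is why the segment attribute has to agree with the mode the code actually runs in.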


KetilO, I do not get the results that you see. Does the example posted produce the problem? I do know that the P4 can be a little tricky (by design), but results should be consistent on the P3. What are the results you are getting?
Posted on 2002-08-14 05:44:52 by bitRAKE