Why do you subtract 1 from each byte in eax?
Also, what 'sign' are you checking for, and why? I know I look extremely idiotic saying this, but what exactly are signs in chars? After all, this is a character array you are scanning (or not)?

The stuff about "D0 len=2 p2rEDIwECX 1 mop 1-1-0" I'm not quite sure about,
so I won't try to confuse you with what I think. I was rather confused by the strlen()
myself, but I think I understand it now, so I'll try my best to explain the above
question.

When you subtract 1 from each byte in eax, any byte in eax which is 0 will
become -1 (signed), i.e. 0FFh, which has its top (sign) bit set. That is the
'sign' being checked: it tells us whether we have reached the end of the string.
The "and ecx, 80808080h" keeps only the sign bit of each byte, so a nonzero
result means the dword may contain the terminating zero. If not, the loop
checks the next dword in the string until it finds the end.
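
To make the sign-bit test above concrete, here is a minimal C sketch of the same arithmetic on one 32-bit value. The function names are made up for illustration and are not from the strlen() being discussed. The first function is the test as described in this post; the second is a well-known stricter variant, added here only as an assumption about how a false hit from characters of 80h and above can be filtered out.

```c
#include <stdint.h>

/* Test as described in the post: subtract 1 from every byte and keep
   only the sign bit of each byte. A nonzero result means "some byte
   may be zero". Bytes of 81h and above also keep their sign bit after
   the subtraction, and a borrow from a lower byte can trip 80h too,
   so those give false hits. */
static uint32_t maybe_zero_byte(uint32_t v)
{
    return (v - 0x01010101u) & 0x80808080u;
}

/* Stricter variant (an assumption, not shown in the thread): masking
   with ~v ignores bytes whose sign bit was already set beforehand, so
   the result is nonzero only if some byte really is zero. */
static uint32_t has_zero_byte(uint32_t v)
{
    return (v - 0x01010101u) & ~v & 0x80808080u;
}
```

For example, 41424344h ("DCBA" in memory) gives 0 from both functions, 41004344h gives a nonzero result from both, and 41E94344h (one byte above 80h, no zero byte) trips only the first one.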

When we have found the end of the string, we need to find its position:
"test eax,000000FFh" checks whether the lowest byte of eax is the zero
terminator; this is how we find out where the string ends.

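The position search can be sketched in C as a chain of byte masks, one per byte of the dword. This is only an illustration with a hypothetical function name, not the code from the strlen() in question. It relies on x86 being little-endian, so the lowest byte of eax is the character at the lowest address.

```c
#include <stdint.h>

/* Returns the index (0..3) of the first zero byte in a dword that was
   loaded from memory on a little-endian machine, or 4 if there is none.
   Index 0 is the byte at the lowest address. */
static int first_zero_byte(uint32_t v)
{
    if ((v & 0x000000FFu) == 0) return 0;  /* like test eax, 000000FFh */
    if ((v & 0x0000FF00u) == 0) return 1;
    if ((v & 0x00FF0000u) == 0) return 2;
    if ((v & 0xFF000000u) == 0) return 3;
    return 4;                              /* no terminator in this dword */
}
```

The string length is then the number of bytes already scanned plus this index.
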
Hope this helps!
Posted on 2003-01-22 05:10:41 by natas
natas,
"The stuff about "D0 len=2 p2rEDIwECX 1 mop 1-1-0" im not quite sure about"

Let's read the book together:

"The comments are interpreted as follows: the MOV EAX,[ESI] instruction is 2 bytes long, it generates one mop for port 2
that reads ESI and writes to (renames) EAX. This information is needed for analyzing the possible bottlenecks.

Let's first analyze the instruction decoding (chapter 14): One of the instructions generates 2 mops (MOV [EDI],EAX).
This instruction must go into decoder D0. There are three decode groups in the loop so it can decode in 3 clock cycles.

For maximum throughput, it is recommended that you order your instructions according to the 4-1-1 pattern:
instructions that generate 2 to 4 mops can be interspersed with two simple 1-mop instructions for
free, in the sense that they do not add to the decoding time." by A.Fog
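
As a rough illustration of the 4-1-1 rule quoted above, the following C sketch counts decode groups for a sequence of instructions given only their mop counts. It is a simplified model under stated assumptions: one decode group per clock, D0 accepts any instruction of 1 to 4 mops, D1 and D2 accept only 1-mop instructions, and instruction length and fetch-block limits are ignored. The function name and the sample sequences are hypothetical.

```c
#include <stddef.h>

/* Simplified 4-1-1 decoder model (assumption: every instruction has
   1..4 mops; length and fetch-block limits are ignored).
   mops[i] is the number of mops generated by instruction i.
   Returns the number of decode groups, i.e. decode clocks. */
static int count_decode_groups(const int *mops, size_t n)
{
    int groups = 0;
    size_t i = 0;

    while (i < n) {
        groups++;   /* start a new group: instruction i goes to D0 */
        i++;
        /* D1 and D2 can each take one following 1-mop instruction */
        for (int slot = 0; slot < 2 && i < n && mops[i] == 1; slot++)
            i++;
    }
    return groups;
}
```

In this model the order 1, 2, 1, 2 needs three groups, because each 2-mop instruction has to wait for D0, while the reordered 2, 1, 2, 1 needs only two.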


The comments are interpreted as follows:


[B]1. 004011A0 8D 88 FF FE FE FE lea ecx, [eax-1010101h]; D0 len=6 p0rEAXwECX 1 mop 1-1-0[/B]

The lea ecx, [eax-1010101h] instruction is 6 bytes long, it generates one mop for port 0
that reads EAX and writes to (renames) ECX.
This instruction must go into decoder D0.

[B]2. 004011A6 42 inc edx ; D1 len=1 p01rwEDXwF 1 mop[/B]

The inc edx instruction is 1 byte long, it generates one mop for port 0 or port 1
that reads EDX and writes to (renames) EDX and Flags.
This instruction must go into decoder D1.

In 1. and 2. we use the 1-1-0 pattern for generation of two mops
(1st mop from D0 and 2nd mop from D1, we don't use D2),
i.e. the generation of the second mop is for free.

[B]3. 004011A7 23 CB and ecx, ebx ; D0 len=2 p01rwECXrEBXwF 1 mop 1-1-1[/B]
The and ecx, ebx instruction is 2 bytes long, it generates one mop for port 0 or port 1
that reads ECX and EBX and writes to (renames) ECX and Flags.
This instruction must go into decoder D0.

[B]4. 004011A9 8B 04 96 mov eax, dword ptr [esi+edx*4]; D1 len=3 p2rESIrEDXwEAX 1 mop[/B]

The mov eax, dword ptr [esi+edx*4] instruction is 3 bytes long,
it generates one mop for port 2
that reads ESI and EDX and writes to (renames) EAX.
This instruction must go into decoder D1.


[B]5. 004011AC 74 F2 je 004011A0 ; D2 len=2 p1rF 1 mop[/B]

The je 004011A0 instruction is 2 bytes long,
it generates one mop for port 1 that reads Flags.
This instruction must go into decoder D2.

In 3., 4. and 5. we use the 1-1-1 pattern for generation of three mops
(1st mop from D0, 2nd mop from D1 and 3rd mop from D2),
i.e. the generation of the second and the third mops is for free.


gladiator,

"...and i cant understand most of agner fog's manual."
Just read it again and again and again...

"..Although i have never used HLA but i do code in HLLs"
What is the difference?

Regards,
Lingo
Posted on 2003-01-22 18:54:12 by lingo12
Thanks Lingo! :alright: Well explained once again! I have not yet
read all of Agner's book though. Actually I hate reading books, since it all
usually gets dry if it's too long. But I think I need to start reading Agner's
book now (the time has come :grin: ). Don't suppose you have written
some articles on assembly language? I would love to read them if
you have. Keep it real!
Posted on 2003-01-22 23:49:27 by natas
Thanks a lot, natas and lingo12, I understand perfectly now.

Btw, how many books are there by Agner Fog?

I only have one, the 'optimization manual' available on hutch's site.

Where can I find the rest of them?
Posted on 2003-01-23 03:12:53 by clippy
gladiator,

If you don't have them already, download the PIII or PIV manuals directly from the Intel site and make a point of giving them a good read. They are the best that exist and have a depth of detail that is not equalled elsewhere.

Agner Fog's manual is very good and addresses stuff that is not obvious in the Intel manuals, so don't give up on it either. Importantly, work out a way to benchmark algos in real time, as it will tell you if your code is faster as you optimise it.

Regards,

hutch@movsd.com
Posted on 2003-01-23 03:28:40 by hutch--

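str_dir_sep:
call str_end
dec edi
mov al,[edi]    ; last character of the string
cmp al,5Ch      ; 5Ch = '\'
je .ready
mov ax,5Ch      ; '\' followed by a zero byte
inc edi
mov [edi],ax    ; append '\' and the new terminator
.ready:
ret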


Try:


str_dir_sep:
call str_end
cmp byte ptr [edi-1],5Ch
jne @F
dec edi
@@: mov word ptr [edi],5ch
ret
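
For readers who think in an HLL, the intent of both versions can be sketched in C as follows. This is a hypothetical counterpart, not part of the original code: the name ensure_dir_sep is invented, and it assumes str_end leaves edi pointing at the string's terminating zero, which the thread does not show. Unlike the assembly, the sketch also guards against an empty string before looking at the previous character.

```c
#include <string.h>

/* Hypothetical C counterpart of str_dir_sep: make sure the path ends
   with a backslash. The buffer must have room for one more character.
   Returns a pointer to the trailing backslash. */
static char *ensure_dir_sep(char *path)
{
    /* Assumed to match what str_end leaves in edi: the terminator. */
    char *end = path + strlen(path);

    if (end != path && end[-1] == '\\')
        return end - 1;        /* already terminated by a separator */

    end[0] = '\\';             /* append the separator ...          */
    end[1] = '\0';             /* ... and a new terminating zero    */
    return end;
}
```

Called on "C:\dir" it produces "C:\dir\", and called on "C:\dir\" it leaves the string unchanged.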
Posted on 2003-03-02 10:22:27 by The Svin