I'm just continuing Hutch's old thread:
http://www.asmcommunity.net/board/showthread.php?threadid=12084&highlight=FileFromPath



OPTION PROLOGUE:NONE ; turn it off
OPTION EPILOGUE:NONE ;
Align 16 ; Align 16 before the proc
FileNameFromPath proc lpPath:DWORD
db 3Eh ; ds: prefix
mov eax, [esp+4] ; eax->lpPath
db 3Eh ; ds: prefix
mov [esp+4], ebx ; saving ebx register
db 3Eh ; ds: prefix
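mov ebx, [eax] ; ebx = first dword of the path
db 3Eh ; ds: prefix
mov ecx, eax ; ecx = scan pointer; eax still -> lpPath (default result)
@LV1: ; loop entry, 16-byte aligned
lea edx, [ebx-1010101h] ; a 0 byte turns into FFh (bit 7 set)
xor ebx, 5C5C5C5Ch ; 5Ch -> ASCII code of "\"; a "\" byte becomes 0
add ecx, 4 ; ecx -> next dword
sub ebx, 1010101h ; a former "\" byte (now 0) turns into FFh
or edx, ebx ; testing "\" and 0 simultaneously
db 3Eh ; ds: prefix
mov ebx, [ecx] ; ebx-> next dword
and edx, 80808080h ; keep the bit-7 flags; false hits are sorted out below
je @LV1 ; no candidate byte in this dword
mov edx, -4 ; edx = offset of the dword just tested, relative to ecx
jne @LV4 ; always taken: mov does not change the flags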
@LV2:
lea eax, [ecx+edx+1] ; eax->address of next char after "\"
@LV3:
inc edx ; step to the next byte of this dword
je @LV1 ; edx = 0: all four bytes checked, back to the dword loop
@LV4:
cmp byte ptr [ecx+edx], 5Ch ; 5Ch -> ASCII code of "\"
je @LV2
cmp byte ptr [ecx+edx], 0 ; is it end of string?
jne @LV3
mov ebx, [esp+4] ; restore ebx register
ret 4 ;
FileNameFromPath endp ;
OPTION PROLOGUE:PROLOGUEDEF ; turn back on the defaults
OPTION EPILOGUE:EPILOGUEDEF ;
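
For anyone who wants to poke at the combined test outside an assembler, here is a rough C sketch of what the lea/xor/sub/or/and sequence in the @LV1 loop computes for one dword. The function name is made up for illustration and is not part of the routine above; the sketch assumes the four characters are loaded little-endian, as on x86.

```c
#include <stdint.h>

/* Nonzero if some byte of x MAY be 0x00 or 0x5C ('\\').
   Hypothetical helper mirroring the dword test in the @LV1 loop.
   False positives are possible (any byte with its top bit set
   also triggers it), so a hit still has to be confirmed byte by
   byte, which is what the @LV4 loop does. */
uint32_t maybe_nul_or_backslash(uint32_t x)
{
    uint32_t zero_probe  = x - 0x01010101u;                  /* lea edx,[ebx-1010101h] */
    uint32_t slash_probe = (x ^ 0x5C5C5C5Cu) - 0x01010101u;  /* xor ebx / sub ebx      */
    return (zero_probe | slash_probe) & 0x80808080u;         /* or edx,ebx / and edx   */
}
```

A zero result means the whole dword can be skipped. A nonzero result only means "look closer", which is why the routine drops into the byte loop instead of trusting the flag bits.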


Regards,
Lingo
Posted on 2003-05-26 20:55:30 by lingo12
Why is it so long?
Posted on 2003-05-26 21:30:34 by comrade
What are these "db 3Eh"s for?
Posted on 2003-05-27 00:35:52 by japheth
They are segment-override prefixes: 3Eh is the DS: prefix byte. But I doubt they are needed. Just my opinion.
Posted on 2003-05-27 03:13:21 by roticv
Two comments:

1. esp is relative to SS even under the flat memory model. The parameter loading with a ds: prefix happens to work because the OS sets DS=ES=SS. But, as roticv mentioned, a segment override is not necessary under the flat memory model, unless, of course, you have changed DS/ES/SS for some reason.

2. I like the way you put the two tests into one. :alright:
<edit>
It seems to me that if you can use either esi or edi, you can avoid the memory access at the tail of the outer loop.
</edit>

Minor point:

jne @LV4 could be jmp @LV4 because the main loop exits only when the previous test does not set ZF.
Posted on 2003-05-27 19:20:49 by Starless
comrade,
"Why is it so long?"
Because there are enough shorter variants in the old thread
(for instance Hutch's FileFromPath2 is 27 bytes without the frame)
and also I have a lot of additional 3Eh bytes...just kidding of course...
For you, here is my shortest variant (19 bytes):


OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
FileNameFromPathS proc lpPath:DWORD
pop eax
pop ecx
push eax
dec ecx
@L1:
mov eax, ecx
@L2:
inc ecx
cmp byte ptr [ecx], 5Ch
je @L1
cmp byte ptr [ecx], 0
jne @L2
inc eax
ret ; 19 bytes
FileNameFromPathS endp
OPTION PROLOGUE:PROLOGUEDEF
OPTION EPILOGUE:EPILOGUEDEF



japheth,
"What are these "db 3Eh"s for?"
If I omit them I'll expect problems with bitRAKE.. just kidding again...

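With the "db 3Eh" bytes I aligned the loop entry address (@LV1) on a 16-byte boundary by manipulating instruction lengths, to reduce decode clock cycles: the four prefixed instructions before @LV1 add up to exactly 16 bytes, so no filler NOPs are needed.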

"If you insert an ALIGN 16 directive before the loop entry then the assembler will put in NOP's and other filler instructions up to the nearest 16 byte boundary. Most assemblers use the instruction XCHG EBX,EBX as a 2-byte filler (the so called 2-byte NOP). Whoever got this idea, it's a bad one because this instruction takes more time than two NOP's on most processors! If the loop executes many times then whatever is outside the loop is unimportant in terms of speed and you don't have to care about the suboptimal filler instructions.
But if the time taken by the fillers is important then you may select the filler instructions manually. You may as well use filler instructions that do something useful, such as refreshing a register in order to avoid register read stalls (see chapter 16.2) For example, if you are using register EBP for addressing but seldom write to it, then you may use MOV EBP,EBP or ADD EBP, 0 as filler in order to reduce the possibilities of register read stalls. If you have nothing useful to do, you may use FXCH ST(0) as a good filler because it doesn't put any load on the execution ports, provided that ST(0) contains a valid floating point value.
...
Yet another possibility is to manipulate instruction lengths. Sometimes you can substitute one instruction with another one with a different length. Many instructions can be coded in different versions with different lengths. The assembler always chooses the shortest possible version of an instruction, but it is often possible to hard-code a longer version. For example, DEC ECX is one byte long, SUB ECX,1 is 3 bytes, and you can code a 6 bytes version with a long immediate operand using this trick:
SUB ECX, 9999
ORG $-4
DD 1
Instructions with a memory operand can be made one byte longer with a SIB byte, but the easiest way of making an instruction one byte longer is to add a DS: segment prefix (DB 3Eh). The microprocessors generally accept redundant and meaningless prefixes (except LOCK) as long as the instruction length does not exceed 15 bytes. Even instructions without a memory operand can have a segment prefix. So if you want the DEC ECX instruction to be 2 bytes long, write:
DB 3Eh
DEC ECX
" by A. Fog




Starless,
"I like the way you put the two test into one"
Thank you! It was my main goal...

"jne @LV4 could be jmp @LV4 because the main loop
exits only when the previous test does not set ZF."


"22.3. Avoiding jumps (all processors)
There can be many reasons why you may want to reduce the number of jumps, calls and returns:
- jump mispredictions are very expensive,
- there are various penalties for consecutive or chained jumps, depending on the processor,
- jump instructions may push one another out of the branch target buffer because of the random replacement algorithm,
- a return takes 2 clocks on PPlain and PMMX, calls and returns generate 4 µops on PPro, PII and PIII,
- on PPro, PII and PIII, instruction fetch may be delayed after a jump (chapter 15), and retirement may be slightly less effective for taken jumps than for other µops (chapter 18).
...
...
And in many cases it is possible to reduce the number of jumps by restructuring your code.
For example, a jump to a jump should be replaced by a jump to the final target.
In some cases this is even possible with conditional jumps if the condition is the same or is known." by A.Fog


Regards,
Lingo
Posted on 2003-05-27 22:45:12 by lingo12
Isn't Agner's note talking about conditional jumps? jmp label has no chance of branch misprediction. If that is mispredicted, the CPU is probably not a good clone of the Pentium or later.
Posted on 2003-05-28 00:40:55 by Starless
lingo12, as long as you're going to make it that long, I'd unroll the end. :tongue:
(the double test is cool)
Posted on 2003-05-28 09:22:28 by bitRAKE
Something interesting I spotted in the Windows SDK on file names:

Use the backslash (\), the forward slash (/), or both to separate components in a path. No other character is acceptable as a path separator.


Forward slash too? Seems like you all might want to consider changing your path parsing algorithms. ;)
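
To make the point concrete, here is a minimal C sketch of the plain byte-by-byte scan that accepts both separators. The function name is hypothetical and this is not a drop-in replacement for any routine in this thread, just an illustration of the extra comparison.

```c
#include <stddef.h>

/* Returns a pointer to the character after the last '\\' or '/'
   in path, or path itself if it contains no separator.
   Hypothetical helper; assumes path is a NUL-terminated string. */
const char *file_name_from_path(const char *path)
{
    const char *name = path;              /* default: no separator found */
    for (const char *p = path; *p != '\0'; ++p) {
        if (*p == '\\' || *p == '/')      /* either separator counts */
            name = p + 1;                 /* remember the char after it */
    }
    return name;
}
```

In the dword-at-a-time version the same change would mean a third probe (xor with 2F2F2F2Fh) folded into the combined test, plus one more cmp in the byte loop.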
Posted on 2003-06-07 18:35:06 by iblis