Hi all.
What do you think: what is the most proper (and fastest) way to check whether two different filename strings point to the same file?

For example, a very slow solution is to call GetFullPathName on both strings and compare the results.

Regards.
Posted on 2003-11-26 14:20:25 by JohnFound
Use CreateFile to open both files. Then invoke GetFileInformationByHandle and compare the nFileIndex value in the BY_HANDLE_FILE_INFORMATION struct.
Posted on 2003-11-26 14:30:05 by Delight
Hi Delight.
The problem is that the file(s) may not exist at all, and I don't want to create them.
Posted on 2003-11-26 14:41:31 by JohnFound
Maybe a routine that compares the two strings case-insensitively, and when a ~ is found in one of them, checks whether the other string (which has no "~" up to this point) is longer than 8 chars, counting from the last "\". If the second string is longer, we set a mark (a variable saying that we're not completely sure so far). Later, if everything else is exactly the same, we'll have to use the slow method that you mentioned.
But if the two strings have ~ at the same offset, we just continue the case-insensitive comparison.




  0123456789ABCDEF
1 c:\someth~1\file1.txt
2 c:\someth~2\file1.txt
3 c:\somsss\file1.txt
4 c:\something\file1.txt
5 c:\something\file2.txt
6 c:\somethi\file1.txt



equality of strings:
1 != 2, because both have ~ at the same offset (9), so they are compared directly and they differ.
1 or 2 might be equal to 4, so we'll have to use the slow way, but they're not equal to 6, as its folder name is shorter than 8 chars.

.. gotta rush out now, tomorrow I'll try to complete the algo :)
Posted on 2003-11-26 14:43:48 by Ultrano
Hi Ultrano.

This is only one of the possible cases: the first string is given with short names and the second with long names.


But what about another case:

str1 = 'c:\somedir1\somedir2\filename1.xxx'
str2 = '..\filename1.xxx'

Do these two strings point to the same file?
Posted on 2003-11-26 14:52:30 by JohnFound
Then it looks like you would have to go for the slow version.
Posted on 2003-11-26 15:01:29 by Delight
If the second char isn't ":", then we get the current directory and parse :). Then we check the filenames using the algo I described. We only need to get the current dir once.
Posted on 2003-11-26 15:35:05 by Ultrano
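A rough C sketch of the "get the current directory once, then parse" idea might look like the following. It is purely lexical and makes several assumptions that are not from the thread: the function name resolve_path is made up, cwd is assumed to be an absolute path with a drive letter, and root-relative ("\foo"), drive-relative ("c:foo") and UNC paths are not handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: resolve `path` against `cwd` without touching the disk.
   Returns 0 on success, -1 if a buffer is too small or the input is unusable. */
int resolve_path(const char *cwd, const char *path, char *out, size_t out_size)
{
    char buf[520];
    size_t n;

    if (path[0] != '\0' && path[1] == ':') {
        /* Second char is ':', so treat the path as already absolute. */
        if (strlen(path) >= sizeof buf)
            return -1;
        strcpy(buf, path);
    } else {
        /* Relative path: prepend the current directory (fetched once). */
        if (strlen(cwd) + 1 + strlen(path) >= sizeof buf)
            return -1;
        strcpy(buf, cwd);
        strcat(buf, "\\");
        strcat(buf, path);
    }
    if (out_size < 4 || strlen(buf) < 2)
        return -1;

    /* Copy the drive prefix ("c:"), then rebuild component by component. */
    out[0] = buf[0];
    out[1] = buf[1];
    n = 2;

    const char *p = buf + 2;
    while (*p != '\0') {
        while (*p == '\\')
            p++;
        size_t len = strcspn(p, "\\");
        if (len == 0)
            break;
        if (len == 1 && p[0] == '.') {
            /* "." means "this directory": nothing to do. */
        } else if (len == 2 && p[0] == '.' && p[1] == '.') {
            /* ".." drops the last component, but never the drive. */
            while (n > 2 && out[n - 1] != '\\')
                n--;
            if (n > 2)
                n--;
        } else {
            if (n + 1 + len >= out_size)
                return -1;
            out[n++] = '\\';
            memcpy(out + n, p, len);
            n += len;
        }
        p += len;
    }
    if (n == 2)
        out[n++] = '\\';   /* bare drive becomes "c:\" */
    out[n] = '\0';
    return 0;
}
```

With a hypothetical current directory of c:\somedir1\somedir2\sub, the string '..\filename1.xxx' resolves to c:\somedir1\somedir2\filename1.xxx, which can then be compared with the absolute form. Short names, SUBST'ed drives and network paths would still need the real API.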

This is only one of the possible cases: the first string is given with short names and the second with long names.

No, it covers all the cases. But there will have to be a local var, something like "suspicion:DWORD" , that will make us keep in mind that somewhere we have already detected a possible difference, but we have ignored it.


     0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
str1 c:\bakaku~1\subfol~1\fileno~3.txt
str2 c:\bakakun_shogun\subfolder_75\filenomer1.txt

At place 9 we see that in str1 there's a ~, and at place 9+2 there's a "\". We also see that in str2 there's no ~, so we mark the comparison as suspicious. Then, in str1 we jump to C, and in str2 to I. If we find the slightest difference in the first 6 chars of each section, then we return NOT_EQUAL immediately.
The "suspicious" flag is set only when one of the strings contains a ~ at the current place, and the other doesn't at its current place.
Posted on 2003-11-26 15:51:21 by Ultrano
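Ultrano's comparison could be sketched in C roughly like this. It is only an illustration under assumptions: the names compare_paths and CMP_* are invented here, the prefix checked is "everything before the ~" rather than a fixed 6 chars, and the part after the ~ (including the extension) is not verified, so CMP_SUSPICIOUS still means "fall back to the slow method".

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum cmp_result { CMP_NOT_EQUAL, CMP_EQUAL, CMP_SUSPICIOUS };

/* Length of the path component starting at s (up to '\\' or the end). */
static size_t component_len(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0' && s[n] != '\\')
        n++;
    return n;
}

/* Case-insensitive comparison of the first n chars. */
static int equal_nocase(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    return 1;
}

/* Offset of '~' inside a component, or len if there is none. */
static size_t tilde_pos(const char *s, size_t len)
{
    const char *t = memchr(s, '~', len);
    return t ? (size_t)(t - s) : len;
}

enum cmp_result compare_paths(const char *a, const char *b)
{
    int suspicious = 0;

    for (;;) {
        size_t la = component_len(a), lb = component_len(b);

        if (la != lb || !equal_nocase(a, b, la)) {
            size_t ta = tilde_pos(a, la), tb = tilde_pos(b, lb);
            const char *short_name, *long_name;
            size_t prefix, long_len;

            /* Only "one side has ~, the other doesn't" can still match. */
            if (ta < la && tb == lb) {
                short_name = a; prefix = ta; long_name = b; long_len = lb;
            } else if (tb < lb && ta == la) {
                short_name = b; prefix = tb; long_name = a; long_len = la;
            } else {
                return CMP_NOT_EQUAL;
            }
            /* The long name must exceed 8 chars and share the prefix. */
            if (long_len <= 8 || prefix > long_len ||
                !equal_nocase(short_name, long_name, prefix))
                return CMP_NOT_EQUAL;
            suspicious = 1;
        }
        a += la;
        b += lb;
        if (*a == '\0' || *b == '\0')
            break;
        a++;   /* skip the '\\' separators */
        b++;
    }
    if (*a != *b)   /* one path has more components than the other */
        return CMP_NOT_EQUAL;
    return suspicious ? CMP_SUSPICIOUS : CMP_EQUAL;
}
```

On the examples from the earlier post, strings 1 and 2 give CMP_NOT_EQUAL, 1 and 4 give CMP_SUSPICIOUS, and 1 and 6 give CMP_NOT_EQUAL.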
Hi. :)
IMHO, Ultrano's solution does not cover all possible cases. Consider SUBST'ed drives, network drives, UNC paths, Unix-style paths... all of them supported by the APIs.
Posted on 2003-11-26 16:28:00 by QvasiModo
I wonder if you could use the two PIDLs and compare them...

http://www.asmcommunity.net/board/index.php?topic=12729
Posted on 2003-11-26 19:30:49 by donkey
Good idea. :)

I've just been reading about these PIDLs (I didn't know about them), and they seem to be calculated from the fully-qualified path.

I wonder if it changes for a substed drive... But more likely it has a problem with network drives (how would the system tell?)

I guess it's a matter of trying it.
(Unless the PIDLs are guaranteed to be unique, I didn't find it in the docs, but I could have missed that).

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/shellcc/platform/shell/programmersguide/shell_basics/namespace.asp
Posted on 2003-11-26 19:49:25 by QvasiModo
Actually it won't work, I just checked. It generates a new PIDL each time. You would have to compare the full PIDL list, not just the pointers. Too bad, it would have been a good solution. It may still be a part of it, though.
Posted on 2003-11-26 19:50:55 by donkey
(I just edited my previous post)

Anyway, if the list of IDs is generated from the path, maybe it changes when the path changes too (even for the same file). Or maybe it behaves like the old DOS api call, TRUENAME, and the list is always the same.
Posted on 2003-11-26 19:53:47 by QvasiModo
Actually the PIDL can be thought of as an array of unique identifiers. Each file system object beginning at the desktop is assigned a unique ID, and the PIDL is a list of those IDs beginning at the desktop and leading directly to a specific file system object, so the list is guaranteed to be unique no matter where the file is located. For this to work, however, you would have to compare the complete list, not just the pointers, as my GetPIDLFromPath function just copies the list into a new buffer each time and returns that pointer. The contents would be the same but the pointers would be different.

The IShellFolder interface has an IShellFolder::CompareIDs method however and that can be used to compare the PIDLs.
Posted on 2003-11-26 20:08:33 by donkey
hrm, just when would GetFullPathName pose a speed problem?
Posted on 2003-11-26 20:09:32 by f0dder

hrm, just when would GetFullPathName pose a speed problem?

IMHO, the problem is really accuracy, rather than speed.
Posted on 2003-11-26 20:20:08 by QvasiModo
This will do it. You will get a definite speed advantage if you use UNICODE for the paths from the start and skip the conversions; otherwise it is far too slow:

; Returns 0 if the paths are equal
; edi is callee-saved under stdcall, so it is preserved with "uses edi"

ComparePIDLs proc uses edi hWnd:DWORD,pszObject1:DWORD,pszObject2:DWORD
LOCAL pShellFolder :DWORD
LOCAL wsz1[MAX_PATH] :WORD
LOCAL wsz2[MAX_PATH] :WORD
LOCAL Attribs :DWORD
LOCAL Pidl1 :DWORD
LOCAL Pidl2 :DWORD
LOCAL Eaten :DWORD
LOCAL pMalloc :DWORD

invoke SHGetMalloc,ADDR pMalloc

invoke MultiByteToWideChar,CP_ACP,NULL,pszObject1,-1,ADDR wsz1,MAX_PATH
invoke MultiByteToWideChar,CP_ACP,NULL,pszObject2,-1,ADDR wsz2,MAX_PATH
invoke SHGetDesktopFolder,ADDR pShellFolder

lea eax,Attribs
push eax
lea eax,Pidl1
push eax
lea eax,Eaten
push eax
lea eax,wsz1
push eax
push NULL
push hWnd
push pShellFolder
mov edi,pShellFolder
mov edi,[edi]
call DWORD PTR [edi+12] ; IShellFolder::ParseDisplayName

lea eax,Attribs
push eax
lea eax,Pidl2
push eax
lea eax,Eaten
push eax
lea eax,wsz2
push eax
push NULL
push hWnd
push pShellFolder
mov edi,pShellFolder
mov edi,[edi]
call DWORD PTR [edi+12] ; IShellFolder::ParseDisplayName

push Pidl1
push Pidl2
push 0
push pShellFolder
mov edi,pShellFolder
mov edi,[edi]
call DWORD PTR [edi+28] ; IShellFolder::CompareIDs

push eax

push pShellFolder
mov edi,pShellFolder
mov edi,[edi]
call DWORD PTR [edi+8] ; IShellFolder::Release

push Pidl1
push pMalloc
mov edi,pMalloc
mov edi,[edi]
call DWORD PTR [edi+20] ;IMalloc.Free

push Pidl2
push pMalloc
mov edi,pMalloc
mov edi,[edi]
call DWORD PTR [edi+20] ;IMalloc.Free

push pMalloc
mov edi,pMalloc
mov edi,[edi]
call DWORD PTR [edi+8] ; IMalloc:Release

pop eax
ret
ComparePIDLs endp
Posted on 2003-11-26 20:27:06 by donkey


IMHO, the problem is really accuracy, rather than speed.


Well, for me GetFullPathName is accurate enough, but the problem is precisely the speed.

I need this for the Fresh compiler (FASM). There are two reasons for that.

1. I want Fresh to compile a file from the source editor instead of from disk if the file is open for editing.
2. For big projects, FASM opens/reads some files many times (sometimes hundreds) during compilation, so I want to build a file cache and read each file from disk (or the editor) only once. This should make compilation faster. I know that Windows has caches too, but they don't help because of point 1.

In both cases I need to compare the filename FASM wants (sometimes with a relative path, sometimes absolute) with the filenames open in the source editor and the filenames of the files in the file cache, and if the file is already open, just return the information.

The most obvious solution was to call GetFullPathName for every comparison, but when I call it several thousand times, it makes compilation even slower than reading directly from the files.

So, now I have a solution: alongside each filename I store a hash value computed from the full path name (even when the filename is relative). It is fast because the hash of each filename is computed only once, and then I compare only numbers in the search loops.
Actually, I wanted to avoid using a hash function and storing a value with every name, but unfortunately it is the only solution I can think of.

Regards.
Posted on 2003-11-27 07:22:14 by JohnFound
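For readers who want to picture the hash trick, here is a minimal C sketch. It is not Fresh's actual code: the names path_hash, file_cache, cache_add and cache_find are hypothetical, FNV-1a is just one possible hash, and the full path is assumed to have been obtained once beforehand (for example from GetFullPathName). Unlike a pure number comparison, this sketch also confirms a hash match with a string comparison to rule out collisions.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Fold case and treat '/' like '\\' so equivalent spellings hash alike. */
static unsigned char fold(unsigned char c)
{
    return (c == '/') ? (unsigned char)'\\' : (unsigned char)tolower(c);
}

/* 32-bit FNV-1a over the folded full path. */
uint32_t path_hash(const char *full_path)
{
    uint32_t h = 2166136261u;
    for (const unsigned char *p = (const unsigned char *)full_path; *p; p++) {
        h ^= fold(*p);
        h *= 16777619u;
    }
    return h;
}

/* Folded string equality, used only to confirm a hash match. */
static int same_path(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (fold((unsigned char)*a) != fold((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

#define CACHE_MAX 64

struct cache_entry {
    uint32_t hash;          /* computed once, when the entry is added */
    const char *full_path;  /* caller keeps this string alive */
    const void *data;       /* cached file contents or editor buffer */
};

struct file_cache {
    struct cache_entry items[CACHE_MAX];
    size_t count;
};

/* Returns 0 on success, -1 if the cache is full. */
int cache_add(struct file_cache *c, const char *full_path, const void *data)
{
    if (c->count >= CACHE_MAX)
        return -1;
    c->items[c->count].hash = path_hash(full_path);
    c->items[c->count].full_path = full_path;
    c->items[c->count].data = data;
    c->count++;
    return 0;
}

/* The search loop compares numbers first; strings only on a hash hit. */
const void *cache_find(const struct file_cache *c, const char *full_path)
{
    uint32_t h = path_hash(full_path);
    for (size_t i = 0; i < c->count; i++)
        if (c->items[i].hash == h && same_path(c->items[i].full_path, full_path))
            return c->items[i].data;
    return NULL;
}
```

The point of the design is that the expensive step (building the full path) happens once per name, while every later lookup is mostly integer comparisons.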
Hrm, is what you're saying that fasm re-opens the same file multiple times when assembling it? Like, a standard build of a single file causes this? Wouldn't it be smarter, then, to fix the problem instead of doing symptomatic treatments?
Posted on 2003-11-27 07:41:39 by f0dder

Hrm, is what you're saying that fasm re-opens the same file multiple times when assembling it? Like, a standard build of a single file causes this? Wouldn't it be smarter, then, to fix the problem instead of doing symptomatic treatments?


Well, fasm does not open the source files multiple times (they are opened only once, during preprocessing), but it does reopen the files included in the binary with the "file" directive. It opens these files on every assembly pass. Unfortunately, big projects need many passes to compile (for example, Fresh itself needs 63 passes). On the other hand, Windows projects have many files that must be included this way: actually, all types of resources. Changing this behaviour would amount to rewriting a big piece of the FASM compiler. Moreover, I am not sure it is possible at all. Also, I don't want to do this because I want to use the standard FASM compiler for Fresh. BTW: the hash trick works well for now, and Fresh is a bit faster than FASMW.

Regards.
Posted on 2003-11-27 08:07:44 by JohnFound