I need a little bit of feedback on some code I have written. I want to lock an object against access from other threads while a member method is being called with xOCall (Exclusive OCall). I added a new variable called dLocked to the object I want to use, and I access this new member atomically to prevent other threads from interfering.

Here is the core of the code:

xOCall macro Cast:req, Args:vararg
    ...
    ...
    push IName
    push eax                                        ;Save registers we need for the method call
    push ecx
    push edx
@Retry:
    mov ecx, [esp + 12]                             ;ecx = IName (pushed before eax, ecx, edx)
    xor edx, edx
    xor eax, eax
    inc edx
    lock cmpxchg dword ptr [ecx].OName.dLocked, edx ;Lock the object if we have access to it
    je @GoOn                                        ;Did we get the lock?
    invoke SwitchToThread                           ;Let another thread run in the meantime
    jmp @Retry                                      ;Check again whether the object has been unlocked
@GoOn:
    pop edx                                        ;Restore registers
    pop ecx
    pop eax
    OCall Cast, Args                                ;Execute the Object Call
    xchg dword ptr [esp], ecx                       ;Save ecx and move IName into ecx
    mov dword ptr [ecx].OName.dLocked, FALSE        ;Unlock the object. This does not need to be atomic
    pop ecx                                        ;Restore ecx
endm
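
For readers who want to try the idea outside of MASM, here is a minimal portable C++ sketch of the same pattern: an atomic compare-and-swap on a per-object flag, a yield while the flag is taken, and a plain release store to unlock. It is not the macro itself. The names `Object`, `lock_object`, `unlock_object`, `exclusive_call` and `run_demo` are hypothetical, and `std::this_thread::yield` merely stands in for SwitchToThread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical object with a lock flag, standing in for OName.dLocked.
struct Object {
    std::atomic<int> dLocked{0};  // 0 = free, 1 = locked
    long counter = 0;             // state protected by the flag
};

// Spin until the flag changes from 0 to 1 (the role of lock cmpxchg).
inline void lock_object(Object& obj) {
    int expected = 0;
    while (!obj.dLocked.compare_exchange_weak(expected, 1,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed)) {
        expected = 0;               // a failed exchange may overwrite it
        std::this_thread::yield();  // stand-in for SwitchToThread
    }
}

// A release store is enough to unlock; no read-modify-write is needed.
inline void unlock_object(Object& obj) {
    obj.dLocked.store(0, std::memory_order_release);
}

// Run fn(obj) while holding the lock, the way xOCall wraps OCall.
template <class Fn>
void exclusive_call(Object& obj, Fn fn) {
    lock_object(obj);
    fn(obj);
    unlock_object(obj);
}

// Demo: several threads increment the counter through exclusive_call.
inline long run_demo(int threads, int iterations) {
    Object obj;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&obj, iterations] {
            for (int i = 0; i < iterations; ++i)
                exclusive_call(obj, [](Object& o) { ++o.counter; });
        });
    }
    for (auto& th : pool) th.join();
    assert(obj.dLocked.load() == 0);  // the lock must be free again
    return obj.counter;
}
```

Calling `run_demo(4, 10000)` should return exactly 40000 if the exclusion works; a lost update would show up as a smaller number.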


I tested the code with different threads in different situations and it seems to work. Does anybody have an idea of when this code could fail?

Regards

Biterider
Posted on 2006-02-28 10:11:41 by Biterider
Looks OK to me. It will work on multiple CPUs. Simple and elegant.
Posted on 2006-03-01 10:17:22 by Homer
It will fail for Console/RDP sessions.

http://www.developerfusion.co.uk/show/1716/8/
He points out that in the Terminal Server edition of NT (which is built into Windows 2000), the kernel no longer has a single "global" namespace, but in fact each Terminal Server session has a private namespace. System services share a common namespace for what is called the "console session". He points out that "this all results in consuming much more memory and making some programming tasks quite tricky, but the result is that every user logged into the Terminal Server is able to start its E-Mail client".

I had to deal with this when running single-instance code on servers with a console session and RDP sessions running at the same time.

Regards,  P1  8)
Posted on 2006-03-01 10:58:59 by P1
Hi P1
I read the article behind the link you posted. I think we are talking about different things. The author of the article describes how to prevent multiple instances of an app from starting, while I'm experimenting with how to simplify and speed up access to an object from multiple threads.
The goal here is to avoid using the typical synchronisation API objects like critical sections, mutexes, etc.

Regards,

Biterider
Posted on 2006-03-02 00:54:03 by Biterider
Why not use a critical section? It starts by using a spinloop (which is guaranteed to work on single-cpu as well as SMP, HT, dualcore etc., and iirc is optimized for single vs. multi cpu based on your HAL.DLL), and if it spins too long it'll use a waitable object instead.

So basically a bit more efficient than your current method, and doesn't use the NT-only SwitchToThread...
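
The spin-then-wait behaviour described above can be approximated in portable C++ as a rough sketch. On Windows the real thing would be a CRITICAL_SECTION (for example one set up with InitializeCriticalSectionAndSpinCount); the code below only imitates the idea with `std::mutex` so that it stays self-contained. `HybridLock`, `hybrid_demo` and the spin limit of 4000 are hypothetical choices, not anything the OS uses.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical hybrid lock: try a bounded number of times, then block.
class HybridLock {
public:
    explicit HybridLock(int spin_limit = 4000) : spin_limit_(spin_limit) {
        assert(spin_limit >= 0);
    }

    void lock() {
        for (int i = 0; i < spin_limit_; ++i) {
            if (mutex_.try_lock()) return;  // got the lock while spinning
        }
        mutex_.lock();  // stop spinning and block until the lock is free
    }

    void unlock() { mutex_.unlock(); }

private:
    std::mutex mutex_;
    int spin_limit_;
};

// Demo: several threads increment a shared counter under the hybrid lock.
inline long hybrid_demo(int threads, int iterations) {
    HybridLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<HybridLock> guard(lock);
                ++counter;
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

The point of the design is that a waiter burns CPU only for a bounded number of attempts; after that it sleeps inside the blocking `lock()` instead of spinning indefinitely.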
Posted on 2006-03-02 07:51:58 by f0dder
Hi
It's more a matter of code design. Imagine you have a huge collection of objects designed to run in a single thread. Now you want to reuse them in a more complex environment where you use multithreading. The idea is to simply use a new calling macro that provides the exclusion capability. If I use a critical section for each instance, I have two problems: where to store the CS structure and when to initialize it. Since this is a special usage case of the object, I cannot initialize the CS in the object initialization (constructor) method. The real drawback you pointed out is that SwitchToThread is only implemented on NT machines and that it doesn't switch to other CPUs. I don't know if implementations exist for 9x systems…
I tried to follow the code of a CS into the kernel, and after long debugging I gave up. I guess that the code I posted should be faster, even if a CS uses a spinlock (untested).
Perhaps a better approach exists. I'm open to new ideas!

Regards

Biterider
Posted on 2006-03-02 09:44:23 by Biterider
It seems that Sleep(0) has the same effect as SwitchToThread, and this API is available on older systems.

Biterider
Posted on 2006-03-02 10:13:16 by Biterider
Biterider,

It was my understanding COM objects operated in the current namespace and would run into multiple use issues as well, when Remote Desktop Sessions ( Terminal Server ) are running.

You may wish to test this out before dismissing it.

Regards,  P1  8)
Posted on 2006-03-02 10:19:28 by P1
nonononono, DON'T do Sleep(0) - it will not relinquish control to lower-priority threads. Really bad for your health.
Posted on 2006-03-02 10:29:16 by f0dder
Hi P1
You are right. Do you have some asm code I can use to test it?

Hi f0dder
Yes, I see your point. If I locked the object with a low priority thread and I use Sleep(0) from a higher priority thread, I will deadlock the whole app. Really bad for the health. SwitchToThread seems to be the best choice.

Biterider
Posted on 2006-03-02 14:31:18 by Biterider
Hi
I compared my synch method with an equivalent one using critical sections and found two interesting things. First, with a CS the overall performance of the threads is approx. 10% better, and second, a CS balances the execution of the threads much better.
Conclusion: back to the keyboard.  :)

Biterider
Posted on 2006-03-03 01:09:51 by Biterider
Hi
I was experimenting a little bit, comparing the above code using xOCall with critical sections. I simply commented out the SwitchToThread line and let the waiting thread loop until the dLocked member is set to FALSE.
I was surprised to see that the results of both methods are completely identical. CPU load, thread execution, thread performance, etc. seem to be absolutely equal.

Regards

Biterider

Posted on 2006-03-04 06:43:54 by Biterider
Lock the code by SessionID as well. Note that the local console ID is always zero, so any non-zero session is a remote one.
invoke GetCurrentProcessId
invoke ProcessIdToSessionId,eax,addr dwSessionID


This is Shell 5.0 code, so nothing before W2K is supported, AFAIK.

So far, I have been using the SessionID to mark the namespace that is being used.

Regards,  P1
Posted on 2006-03-06 09:52:09 by P1
The identical results of xOCall and critical sections are probably because neither of them has to spin for a long time, i.e. the critical section obtains the lock before entering a blocking wait. Critical sections still have the advantage, though, that if one has to wait for a longer time, it'll enter a wait state. Your code will keep spinning and use CPU, unless I'm mistaken :)
Posted on 2006-03-07 09:18:06 by f0dder
I agree with you, and that's why I was surprised when I saw the results.

Biterider
Posted on 2006-03-07 10:34:33 by Biterider
Depending on your application, there is a risk of never releasing the protection if the function throws an exception.
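
To illustrate the risk and one common way around it, here is a small C++ sketch, assuming C++ exceptions as a stand-in for whatever exception mechanism the real application uses. A guard object takes the flag in its constructor and clears it in its destructor, so the flag is released even when the protected method throws. `GuardedObject`, `ObjectGuard`, `failing_method` and `unlocked_after_throw` are hypothetical names; in assembly the equivalent would need an exception handler around the call, which is not shown.

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>
#include <thread>

// Hypothetical object carrying the lock flag (the role of dLocked).
struct GuardedObject {
    std::atomic<int> dLocked{0};  // 0 = free, 1 = locked
};

// Acquires the flag on construction and always releases it on destruction.
class ObjectGuard {
public:
    explicit ObjectGuard(GuardedObject& obj) : obj_(obj) {
        int expected = 0;
        while (!obj_.dLocked.compare_exchange_weak(expected, 1,
                                                   std::memory_order_acquire,
                                                   std::memory_order_relaxed)) {
            expected = 0;
            std::this_thread::yield();
        }
    }
    ~ObjectGuard() { obj_.dLocked.store(0, std::memory_order_release); }

    ObjectGuard(const ObjectGuard&) = delete;
    ObjectGuard& operator=(const ObjectGuard&) = delete;

private:
    GuardedObject& obj_;
};

// Hypothetical method that fails while the object is locked.
inline void failing_method(GuardedObject& obj) {
    ObjectGuard guard(obj);
    throw std::runtime_error("method failed");
}

// Returns true if the object is unlocked after the exception propagated.
inline bool unlocked_after_throw() {
    GuardedObject obj;
    try {
        failing_method(obj);
    } catch (const std::runtime_error&) {
        // The guard's destructor already ran during stack unwinding.
    }
    return obj.dLocked.load() == 0;
}
```

Without the guard, the throw would skip the unlock step and every later caller would spin on the flag forever.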
Posted on 2006-03-07 10:42:40 by Dr. Manhattan