SpooK,

Thanks for the tidy up of this nonsense.

Here is my response under your new guidelines.

There are 3 separate shell style procedures in the MASM32 library. Source code is available as usual at www.masm32.com.

1. shell: DOS style emulation with no process priority changes, to run on old or badly set up boxes that crash with the other types.

2. shell_ex: Priority changing procedure of modern design. May not work on old or damaged boxes.

3. wshell: The wrong way to perform the task but it runs on most later boxes. May not work on old or damaged boxes.

All three (3) algos are documented in the masmlib.hlp file under the category "Misc".

I have written demonstrations in the past when this topic has been recycled before so I don't intend to waste the time doing it again. Source code and documentation provided at the above link.

Regards,

hutch at movsd dot com
Posted on 2005-05-06 17:52:30 by hutch--
I have written demonstrations in the past when this topic has been recycled before so I don't intend to waste the time doing it again. Source code and documentation provided at the above link.

Ok, thanks for the reply. Could you please point me to the bug demonstration code? I couldn't find it in this or the MASM board, and I'd like to take a look at it. :)
Posted on 2005-05-06 17:56:34 by QvasiModo
QvasiModo,

I don't know where it is any longer. I may have it on one of my machines, but it would take a detailed search to find it.

I did the tests this way. With the old shell proc that must run on anything, I ran it in a loop calling a small test piece to demonstrate how much memory was lost with the two unallocated handles. It also demonstrated that the memory is recovered on exit.

The shell_ex proc was tested in the same loop design but without the memory leak test, as it deallocates the two handles. It is useful in this type of test to have a number of test pieces that use low, medium and high priority settings while the calling app has its priority set to idle. This shows the effects under the useful range of priority settings.

The wshell proc was to shut up the noise. It is a standard WaitForSingleObject() style proc that correctly zeros the STARTUPINFO structure for future compatibility with OS changes, but I don't like the design and don't trust it, and have documented it that way. It should use the same test design as the shell_ex proc.

As far as the documentation for the old AMD K6-2 3D NOW processor running at 550 MHz goes, it died recently and I replaced it with a Sempron 2.4 to run my printer and scanner. It was a problematic processor for a few reasons: Microsoft had to issue a patch for win95b so that it could run on it, as it had a LOOP instruction that was too fast for the win95 code design.

When I get my digital camera back I will pull the board out of the junk and take a pretty pic of it for any who don't know what they were.

Regards,

hutch at movsd dot com
Posted on 2005-05-06 18:49:34 by hutch--
Here is some data from Microsoft on the K6-2 processor. The box used to use a 350 K6-2 but it was upgraded to a 550 K6-2 and the win95b problem became critical.

I eventually gave up and ran win98se on it but this processor is the one that crashed EVERY TIME the handles were deallocated.

http://www.microsoft.com/windows95/downloads/contents/WURecommended/S_WUServicePacks/AMDPatch/Default.asp


Windows 95 Update for AMD-K6®-2/350
Read Me First
Microsoft® Windows® 95 Update for AMD-K6®-2/350 and Above Patch for Windows 95 OEM SR2 and above fixes a software timing loop that is sensitive to processor frequency and is not a processor erratum. Please note that this patch will not resolve issues associated with any other versions of Windows 95 other than the OEM SR2 version.

To determine the version that you have on your system, please read below. If you are not sure which version of Windows 95 you have, you can find out simply by checking your System Properties. To do this, right mouse click on "My Computer" and select "Properties". An OEM SR2 system will show a designator, such as "4.00.9500 B" as the version. The number may vary slightly, but the letter designator will be a "B" for the OSR2 version. Version designators without a "B", such as an "A" or nothing after the number, cannot utilize this patch.
Posted on 2005-05-07 04:48:53 by hutch--
I did the tests this way. With the old shell proc that must run on anything, I ran it in a loop calling a small test piece to demonstrate how much memory was lost with the two unallocated handles. It also demonstrated that the memory is recovered on exit.


Ok. I know I posted examples of all the issues last time the topic was discussed (about a year ago), but here is another proof of concept as per the new rules ;).

leaktest.exe runs in a loop calling the M32LIB shell function to run client.exe, which does nothing but exit. It shows a dot every 256 calls, and on my fresh Win98SE install I can get about 1 and 1/3 lines of dots before either the app crashes, Windows hangs, or I get a BSOD.

I know most apps won't call shell so many times, but that's why this is a proof of concept -- it's merely to show that there is a potential problem :).
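
A test of this kind can be sketched roughly as follows. This is an illustrative reconstruction, not the attached source: it assumes the MASM32 library's shell and StdOut procedures each take a single string pointer (as prototyped in masm32.inc), and the names clientcmd, dotstr and rounds_done are made up for the example.

```asm
.486
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
include \masm32\include\masm32.inc      ; assumed to prototype shell and StdOut
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\masm32.lib

ROUNDS equ 35000                        ; total number of shell calls

.data
    clientcmd   db "client.exe", 0      ; client that does nothing but exit
    dotstr      db ".", 0
    rounds_done dd 0

.code

start:
  round_loop:
    invoke shell, ADDR clientcmd        ; run the client and wait for it
    inc rounds_done
    test rounds_done, 255               ; low 8 bits zero on every 256th call
    jnz @F
    invoke StdOut, ADDR dotstr          ; progress dot
  @@:
    cmp rounds_done, ROUNDS
    jb round_loop

    invoke ExitProcess, 0

end start
```

The counter lives in memory rather than in a register so the sketch does not depend on which registers the library procedure preserves.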

Edit: I tried using wshell instead, and it worked fine .. completed the 35000 rounds without any problems. It worries me a bit that there is no error checking in any of the three functions :|.
Posted on 2005-05-07 06:33:32 by Jibz

It worries me a bit that there is no error checking in any of the three functions

This *could* be why hutch has had crashes with the handle-closing version? I don't know just how badly mannered win95 is, but it wouldn't surprise me that CloseHandle with bogus arguments could cause a detour to BSOD-land - since there's no checking on the return value from CreateProcess, handles could very well be bogus...
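
To illustrate the point, here is a minimal sketch of a launcher that checks the CreateProcess return value before it touches the handles. It is not the library's code: the name run_checked and the -1 failure value are made up for this example, and NORMAL_PRIORITY_CLASS is only an assumed default.

```asm
.486
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.code

; run_checked (hypothetical name): start lpCmdLine and block until it exits.
; Returns the child's exit code in EAX, or -1 if CreateProcess failed.
run_checked proc lpCmdLine:DWORD

    LOCAL st_info:STARTUPINFO
    LOCAL pr_info:PROCESS_INFORMATION
    LOCAL xc:DWORD

    ; zero the whole structure, then fill in the size field
    invoke RtlZeroMemory, ADDR st_info, SIZEOF STARTUPINFO
    mov st_info.cb, SIZEOF STARTUPINFO

    invoke CreateProcess, NULL, lpCmdLine, NULL, NULL, FALSE, \
                          NORMAL_PRIORITY_CLASS, NULL, NULL, \
                          ADDR st_info, ADDR pr_info
    test eax, eax
    jz failed                 ; pr_info was not filled in, so leave it alone

    invoke WaitForSingleObject, pr_info.hProcess, INFINITE
    invoke GetExitCodeProcess, pr_info.hProcess, ADDR xc

    ; both handles are only closed when CreateProcess succeeded
    invoke CloseHandle, pr_info.hThread
    invoke CloseHandle, pr_info.hProcess

    mov eax, xc
    ret

  failed:
    mov eax, -1
    ret

run_checked endp

end
```

A child that really exits with code -1 cannot be told apart from a failed launch here; a real version would probably report success and exit code separately.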
Posted on 2005-05-07 06:58:40 by f0dder
Running under Windows 2000 SP4, it took about half a line of dots to use up all of the available 130 MB of physical memory, after which CreateProcess fails (the dots continue, because there is no error checking).

wshell, again, runs with no problems, and no increase in memory usage.
Posted on 2005-05-07 07:32:30 by Jibz
Jibz,

It runs fine on my win2k box, completes the 35k iterations with no problems. I tested it on an installation of win98se and it runs about 10k iterations then GP faults. Diagnostics say access violation in kernel32. I changed the test program to shell_ex with priority set to NORMAL_PRIORITY_CLASS for the called app and it completes the test on both OS versions.

Probably a handle limitation in win98se but it's hardly a problem, you don't use the original shell algo for this type of task, it is designed as failsafe on old boxes with the cost being the leaked handles. For simplified high repeat requirements, the shell_ex works fine but any serious app that did something like this would need a "roll your own" to have enough flexibility to do the job.

LATER: After reading your reply, the same effect on win2k but no crash. It takes off about 2 thirds of the way through so it is the same effect. Again probably a handle limit.
Posted on 2005-05-07 07:58:50 by hutch--
I am glad I finally got you to the point where you acknowledge that there is in fact a leak and that it can cause crashes, excessive memory usage and silent failure depending on the OS ;).

If you are still refusing to do anything about it other than point to the two other versions, I would strongly encourage you to add a warning to the documentation of the shell function, like you did with your claimed Win9x problems. Perhaps something like:

Warning: May randomly crash, use up all available memory, or silently fail if you call it too many times. Uses CPU time while waiting for program to exit.

And the note at the end of the wshell description is hardly enough .. the note should be with the function that has the problem :D.
Posted on 2005-05-07 08:19:57 by Jibz
Jibz, ran your leaktest (almost identical to mine :)) under Win98. Ends up GP faulting, and repeatedly - you can't terminate the app. Can't restart or shut down the computer, and can't bring up the ctrl-alt-del dialog.

If leaktest is terminated before the GPF'ing, after it has started slugging the computer, there's a multi-minute stall while 9x cleans up - but it *usually* recovers. This is on a 1.7GHz P4 celeron with 256 megs of ram, btw - so the real sluggish performance is not because of crap hardware.

On XP it runs until it exhausts memory (or some process number limit?), then the dots fly rapidly until it terminates gracefully. If, however, I have sysinternals' Process Explorer running, the system crawls to a halt, and doesn't recover until procexp is shut down (which is a difficult thing to do when you can hardly change active window).

For a fun show, try running a bunch of the leaktests at the same time - it also shows how the GetExitCodeProcess polling affects the performance of other tasks.
Posted on 2005-05-07 08:24:21 by f0dder
Jibz,


I am glad I finally got you to the point where you acknowledge that there is in fact a leak


Your memory is failing you, I published a test last round of nonsense that demonstrated the leak. It's hard to finally acknowledge something that I published at least a year ago.
Posted on 2005-05-07 08:38:07 by hutch--
Noting SpooK's new rule,


For a fun show, try running a bunch of the leaktests at the same time - it also shows how the GetExitCodeProcess polling affects the performance of other tasks.


Run it with shell_ex.
Posted on 2005-05-07 08:49:15 by hutch--

Run it with shell_ex.

Nope - not when the purpose is showing the bad effect of GetExitCodeProcess polling.
Posted on 2005-05-07 08:52:34 by f0dder
Noting SpooK's new rule,


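  @@:
    invoke GetExitCodeProcess,pr_info.hProcess,ADDR xc   ; fetch the child's exit code into xc
    invoke Sleep, 0                                      ; give up the rest of this time slice
    cmp xc, STILL_ACTIVE                                 ; is the child still running?
    je @B                                                ; yes, poll again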


The loop code in shell_ex.
Posted on 2005-05-07 08:56:24 by hutch--
Sleep(0) still exhibits problems with background threads - you need a non-zero argument to "fix" this problem (as I already mentioned and showed code for *before* "SpooK's new rule"). Besides, adding a Sleep is a symptomatic treatment, not a fix.
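
For comparison, here is a sketch of the two approaches side by side. The procedure names wait_polling and wait_blocking and the 10 ms interval are assumptions picked for illustration; neither procedure is taken from the library.

```asm
.486
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.code

; wait_polling (hypothetical): poll the exit code, sleeping between polls.
wait_polling proc hProcess:DWORD

    LOCAL xc:DWORD

  poll:
    invoke GetExitCodeProcess, hProcess, ADDR xc
    test eax, eax
    jz done                   ; the call itself failed, stop polling
    cmp xc, STILL_ACTIVE
    jne done                  ; child has exited
    invoke Sleep, 10          ; non-zero, so lower priority threads get to run
    jmp poll

  done:
    ret

wait_polling endp

; wait_blocking (hypothetical): sleep until the process handle is signalled.
wait_blocking proc hProcess:DWORD

    invoke WaitForSingleObject, hProcess, INFINITE
    ret                       ; EAX holds the wait result

wait_blocking endp

end
```

The polling version still wakes up on every interval and can react up to one interval late, and it never returns if the child happens to exit with the value of STILL_ACTIVE (259). The blocking version uses no CPU time while waiting.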
Posted on 2005-05-07 09:00:57 by f0dder
Here is a test app for showing the CPU usage problem of the shell function.

What I used to test is: rar a -ri1 bison.rar bison-1.31.tar

This runs rar and lowers its priority while compressing, which is a nice way to run large backups without having to leave the computer alone in the meantime. But it's merely an example to illustrate any task running in the background at a lower priority.

The 3 MB file takes roughly 1 second to compress on my box.

cputest.exe uses M32LIB shell to run client.exe, which shows a message box. This means the shell GetExitCodeProcess loop will be running until you press the OK button.

With it running, compressing the 3 MB file took 1 minute and 55 seconds.

cputestw.exe uses M32LIB wshell instead, but otherwise does the same thing.

With it running, compressing the 3 MB file took roughly 1 second.

The loop code in shell_ex.


shell_ex suffers the reverse problem. Try running cputeste.exe and then start rar on compressing some huge file. If you then press OK, it takes a while for the loop to get scheduled. On my box it took about 5 seconds before the loop figured out the client had exited :).

Posted on 2005-05-07 09:12:30 by Jibz
Your memory is failing you, I published a test last round of nonsense that demonstrated the leak. It's hard to finally acknowledge something that I published at least a year ago.


Yes, you probably did somewhere along the way.

Still, nothing changed about shell or its description in the docs. I can see you added a note under wshell about the leak, but that's a bit of a strange place to put it :D.
Posted on 2005-05-07 09:19:29 by Jibz
Noting SpooK's new rules,

Here is the objective test of the two algos, shell_ex and wshell.

wshell is slightly faster as it does a bit less, but the difference is about 1-2%. Both top out at 100% processor usage during the test. You need to run task manager to see the processor usage. Note that shell_ex sets the priority to NORMAL.
Posted on 2005-05-07 09:27:19 by hutch--
hutch--, your test might be objective, but it fails to show a couple of problems that Jibz perfectly illustrates; going from 1 second to 1min55sec is clearly a performance degradation.
Posted on 2005-05-07 09:34:29 by f0dder
I must admit that it's a very objective test. It nicely measures the time taken to execute either function 5000 times.

Unfortunately, it doesn't address the problem, since there is nothing else running that can be slowed down by, or slow down, the loop.
Posted on 2005-05-07 09:37:19 by Jibz