Here it is..

Please test it and report your CPU (I'm especially curious about Pentium IVs) and how many cycles it takes on that CPU to execute:

from 1 to 18 NOPs (+ a single RET, of course)

and this routine:

LEA EAX, [n] ; where n is from 1 to 18
.loop:
DEC EAX
JNZ .loop
RET

That is, you should report 36 results.
Posted on 2002-08-23 04:38:33 by Maverick
In the licence world, have you ever had it this good? I don't think so...

LICENCE OF USE

I've got a lot of past posts to read... I don't think it can get much better than this...
Posted on 2002-08-23 20:56:04 by cmax
Maverick:
On my Athlon 1400 MHz I set up your profile routine as described in the txt file (VirtualAlloc for 8k, copy, call offset 559, push address of routine, call offset 1; the routine was at the start of my code segment to ensure page alignment). It worked fine for the 1 to 18 NOPs plus RET. Here are the results (the left number is the NOP count, the right number is the cycles for that many NOPs plus the RET):
; 0 - 0
nop ;1 - 1
nop ;2 - 1
nop ;3 - 1
nop ;4 - 2
nop ;5 - 2
nop ;6 - 2
nop ;7 - 3
nop ;8 - 3
nop ;9 - 3
nop ;10 - 4
nop ;11 - 4
nop ;12 - 4
nop ;13 - 5
nop ;14 - 5
nop ;15 - 5
nop ;16 - 6
nop ;17 - 6
nop ;18 - 6
ret
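
The NOP timings above step up by one cycle for every three instructions. A minimal Python sketch that checks the reported numbers against ceil(n / 3); the formula is only an observation fitted to this table (it would be consistent with the Athlon handling up to three such instructions per cycle), not something stated by the profiler or its docs, and `predicted_cycles` is a name made up for this example:

```python
import math

# Cycle counts reported above for 1..18 NOPs followed by RET (Athlon 1400 MHz).
reported = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6]


def predicted_cycles(nops: int) -> int:
    """Assumed model: three NOPs per cycle, rounded up."""
    return math.ceil(nops / 3)


# Collect every row where the assumed model disagrees with the reported value.
mismatches = [
    (n, cycles, predicted_cycles(n))
    for n, cycles in enumerate(reported, start=1)
    if cycles != predicted_cycles(n)
]
print("mismatches:", mismatches)
```

An empty mismatch list only means the table is regular; it says nothing about whether the second routine's odd results come from the same measurement path.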

However, for the second routine (under MASM):
lea eax,DWORD PTR [n]
_loop:
dec eax
jnz _loop
ret

The results, while consistent, did not seem valid:
n = 1 cycles=-2
n = 2 cycles=39
n = 3 cycles=40
.
.
.
But I didn't test beyond n=3

If needed, I have attached the source I used (a bit messy).
Posted on 2002-08-24 02:12:56 by huh
Hi cmax: ? :)

Hi huh: please, could you also attach the EXE? That way, before looking at the source, I can be sure how it behaves on my machine.
Posted on 2002-08-24 02:25:30 by Maverick
Here:
Posted on 2002-08-25 01:18:56 by huh
Maverick, believe it or not, I am not even finished yet... The reason is that I always find better ways to do things while at the same time trying to really learn the new stuff about the differences between Masm, fasm and just plain old coding. I've got junk everywhere, waiting for my final decision on how it should be put back together properly.

It's really weird... but interesting, and it's all I know about ASM, my whole life's work since getting involved with it. There are just so many ways to do things and it's hard to be sure which to go with. Give me just a little more time and I will post it. I am trying to learn PE and all the stuff that goes with it, and I see what you mean by limitations. Most of that stuff is written in Tasm or Nasm and it's hell to try to do it with Masm... at least for me... I have been making promises for 2 years now, but I've got to have it at least 99.9% perfect... FOR ME IT'S JUST TOO MANY DECISIONS TO MAKE, and I have not mastered the first thing yet, so I can't call it until I do.

I really want to stick with Masm because I think it is Windows itself that causes most of the drawbacks, and if I can do everything Nasm does by learning how to do it with Masm I would be more than happy... I really want to beat it if POSSIBLE. I don't know anything else and I'm afraid I might miss something.

See ya soon


PS: Sorry I did not get back sooner. I don't even have a phone now because of a $1200.00 bill... SBC ain't S*it... My bill was 69.00 a month for 2500 min, but when I moved 15 miles from my service provider (IN THE SAME CITY) SBC was charging me .10 a minute or more depending on the time of day, and I did not realize it. So I say to he*l with THEM, I'll get a cell phone someday...

Mergers and deregulation are full of it...
Posted on 2002-08-25 19:54:03 by cmax
I should have provided a test program in the archive.. I didn't do it because it was made with my own programming language and I'm in the middle of an "anti reverse engineering" rewrite, so it wouldn't have been protected against curious eyes (of side structures).

With my extremely little free time I thus wrote a FAsm sample application, but it doesn't work as expected (while I have a non-releasable test program working perfectly), so I am trying to understand why my FAsm program misbehaves.

Please understand the time I can dedicate to this is ridiculous at the moment, and I can't help it for now. :(
Posted on 2002-08-26 01:49:13 by Maverick

Please understand the time I can dedicate to this is ridiculous at the moment, and I can't help it for now.


I would not worry about that. The general opinion (I know mine is) is one of appreciation for the time and effort you do put into this great tool.

And something else that may help explain things: (From the AMD manual 22007.pdf)

Although rare, do not place critical code at the border between
32-byte aligned code segments and a data segments. Code at
the start or end of a data segment should be executed as
seldomly as possible or simply padded with garbage.


This could be why my test2.exe is not working, as I'm on an athalon processor.
Posted on 2002-08-27 01:08:32 by huh
It's "Athlon", my friend :D I have one myself ;)
Posted on 2002-08-29 08:13:16 by x86asm
I just rediscovered profiler 2.
I've set it up and it works for procedures which do not take any parameters.

As soon as I try a procedure which takes parameters, esp becomes unbalanced.
here's how I call it:

for a function without parameters:
push function
call Profiler+1
this works like a charm and it returns the expected results.

for a function with parameters:
push param1
push param2
....
push function
call Profiler+1

I've got the profiler defined in its own section, which I've padded with NULLs to a total size of 8192 bytes.

The procedures are also in their own unique sections, so on loading, everything is aligned on a page boundary...

Does anyone know how to solve the problem? It's definitely an unbalanced stack and it occurs inside the profiler.

thanks.
Posted on 2003-01-18 07:37:23 by MArtial_Code

There's something damn weird with PROFILE v2.0, but I haven't had any time to investigate it, also because I feel it would take a lot of time.

I will explain: I use it *a lot*, every day, and have *never* had *any* single problem or suspect result; it works like a charm.. but this happens in my own development tools (a module for my compiler and OS, that is). Then, to share it with You all, I wanted to make a version for the various assemblers out there, and here some problems began. Yes, I noticed some misbehaviours in MASM, etc.. but they don't appear at all in my own internal development environments.
I guess it's a non-trivial problem.. which needs some deep investigation. I haven't forgotten it, and I hope to fix it sooner or later (read: better sooner), when the load of work I have on my shoulders gives me the chance.

In the meantime, if it works for you (I doubt it will in MASM), you may stick with the first version of PROFILE. It already worked well anyway; V2 is there for some other reasons (universality of assembler, but maybe some bug was introduced?), but it clearly fails in some program environments. Basically, what is particular about my compiler/development environment, and may make it behave differently from a usual development environment for assembly language (in this context), is the separation of code and data into two different memory pages. But I thought I had addressed this aspect completely in V2 for other assemblers too.. perhaps some subtle bug got in. I don't use V2, as you may have guessed, since I have my own internal version (but it's for my own compiler and OS and thus it's a bit particular).

Anyway.. I should put my hands on it, I know. ;P

I will try to save some free time for it.. but I better not promise anything, my life is a mess right now.
I'm committed to giving a fully working PROFILE V2 to You all though, don't doubt.

PS: I'm sure there's no stack imbalance if it is used as explained in the docs; please check them carefully, and report the parts of the *docs* you're in doubt about. But if, even so, you still see a stack imbalance, I will start investigating from there. Thank you. (maybe an error happened while uploading it :grin: )
Posted on 2003-01-18 08:46:03 by Maverick
Hi Maverick, I understand you are very busy so don't sweat it...
Here's the setup (for MASM); I'm certain I followed your instructions to the letter.


Each procedure to be profiled is in its own section, aligned on a page boundary.
The profiler is in its own section, which is two pages (8192 bytes) long.
The profiler is loaded at the beginning of the section and the rest is zeroed out.
Call the code at offset 559 only once.
Push the args and/or the function to be profiled and call the code at offset 1.




.586
.model flat, stdcall
option casemap :none

include standard.inc ;
DBGWIN_DEBUG_ON =1;equates for vkims debug macros
DBGWIN_EXT_INFO =0 ;

Main Proto

PROFILERSIZE equ 660
PROFILERINITOFFSET equ 559
PROFILERRESULTOFFSET equ 4096

_PROFILERRESULT struct
lowdword dword 0
hidword dword 0
_PROFILERRESULT ends

externdef ProfilerResult:_PROFILERRESULT
externdef Profiler:proc

Profile MACRO func:req,args:VARARG ;this macro takes care of things
ifndef PROFILERINITED
%echo call Profiler+PROFILERINITOFFSET
call Profiler+PROFILERINITOFFSET ;called only once
PROFILERINITED equ 0
endif
PrintText "Profiling &func"
PrintHex esp,":before profile" ;print esp before args are pushed
ifnb <args>
y TEXTEQU <>
FOR arg,<&args> ;reverse the order of the parameters
y CATSTR <arg>,<!,>,y
ENDM
y SUBSTR y,1,@SizeStr(%y) - 1
%FOR arg,<y>
push arg ;push parameters
%echo push arg
ENDM
endif
push func ;push function address
call [Profiler+1]
PrintHex esp, ":after profile" ;print esp after args are popped
lea eax,[Profiler+PROFILERRESULTOFFSET+0]
mov eax,[eax]
mov ProfilerResult.lowdword,eax
lea eax,[Profiler+PROFILERRESULTOFFSET+4]
mov eax,[eax]
mov ProfilerResult.hidword,eax
ENDM

.data
ProfilerResult _PROFILERRESULT <>
text db "move source string pointer into ecx (cannot do memory indirection fromm emory operand)",0
.data?
BigBuf db 1000 dup (?)
dst dd ?
src dd ?

.code
start:
invoke Main
invoke ExitProcess,0

Main proc
Profile aligned ;profile function
PrintDword ProfilerResult.lowdword,"low dword"
PrintDword ProfilerResult.hidword,"High dword"
PrintLine
Profile RevStr ,offset text,offset BigBuf ;profile function
PrintDword ProfilerResult.lowdword,"low dword"
PrintDword ProfilerResult.hidword,"High dword"
PrintLine
ret
Main endp

_CODE segment dword 'CODE' ;own section
aligned proc
db 18 dup (90H)
ret
aligned endp
_CODE ends

_CODE2 segment dword 'CODE2' ;own section
RevStr proc _src:ptr,_dst:ptr
db 18 dup (90H)
ret
RevStr endp
_CODE2 ends

_PROF1 segment dword 'PROFILER' ;separate section
;Profiler proc ;
Profiler: ;
db 0C3h, 089h, 02Dh, 038h, 000h, 000h, 000h, 0BDh, 000h, 000h, 000h, 000h
db 09Ch, 08Fh, 045h, 03Ch, 089h, 07Dh, 034h, 089h, 075h, 030h, 089h, 055h
db 02Ch, 089h, 04Dh, 028h, 089h, 05Dh, 024h, 089h, 045h, 020h, 08Fh, 045h
db 010h, 08Bh, 045h, 008h, 00Bh, 045h, 00Ch, 075h, 048h, 0C7h, 005h, 0C1h
db 001h, 000h, 000h, 000h, 000h, 000h, 000h, 081h, 02Dh, 0C1h, 001h, 000h
db 000h, 0C5h, 001h, 000h, 000h, 0C7h, 045h, 05Ch, 000h, 000h, 000h, 000h
db 0C7h, 045h, 058h, 000h, 000h, 000h, 000h, 0C7h, 045h, 014h, 05Bh, 000h
db 000h, 000h, 0E9h, 0F6h, 000h, 000h, 000h, 0E8h, 0A8h, 000h, 000h, 000h
db 0FFh, 045h, 058h, 083h, 07Dh, 058h, 010h, 072h, 0E6h, 08Bh, 045h, 000h
db 08Bh, 055h, 004h, 089h, 045h, 008h, 089h, 055h, 00Ch, 058h, 0A3h, 0C1h
db 001h, 000h, 000h, 081h, 02Dh, 0C1h, 001h, 000h, 000h, 0C5h, 001h, 000h
db 000h, 0C7h, 045h, 05Ch, 000h, 000h, 000h, 000h, 0C7h, 045h, 058h, 000h
db 000h, 000h, 000h, 0C7h, 045h, 014h, 09Fh, 000h, 000h, 000h, 0E9h, 0B2h
db 000h, 000h, 000h, 0E8h, 064h, 000h, 000h, 000h, 0FFh, 045h, 058h, 083h
db 07Dh, 058h, 010h, 072h, 0E6h, 08Bh, 085h, 000h, 001h, 000h, 000h, 031h
db 0DBh, 0B9h, 001h, 000h, 000h, 000h, 039h, 084h, 08Dh, 000h, 001h, 000h
db 000h, 076h, 009h, 08Bh, 084h, 08Dh, 000h, 001h, 000h, 000h, 089h, 0CBh
db 041h, 03Bh, 04Dh, 05Ch, 072h, 0E8h, 08Bh, 084h, 0DDh, 080h, 000h, 000h
db 000h, 08Bh, 094h, 0DDh, 084h, 000h, 000h, 000h, 02Bh, 045h, 008h, 01Bh
db 055h, 00Ch, 089h, 045h, 000h, 089h, 055h, 004h, 08Bh, 045h, 040h, 08Bh
db 05Dh, 044h, 08Bh, 04Dh, 048h, 08Bh, 055h, 04Ch, 08Bh, 06Dh, 050h, 0FFh
db 035h, 054h, 000h, 000h, 000h, 09Dh, 0FFh, 025h, 010h, 000h, 000h, 000h
db 08Bh, 045h, 000h, 08Bh, 055h, 004h, 0B9h, 000h, 000h, 000h, 000h, 039h
db 084h, 0CDh, 080h, 000h, 000h, 000h, 075h, 012h, 039h, 094h, 0CDh, 084h
db 000h, 000h, 000h, 075h, 009h, 0FFh, 084h, 08Dh, 000h, 001h, 000h, 000h
db 0EBh, 022h, 041h, 03Bh, 04Dh, 05Ch, 072h, 0DFh, 089h, 084h, 0CDh, 080h
db 000h, 000h, 000h, 089h, 094h, 0CDh, 084h, 000h, 000h, 000h, 0C7h, 084h
db 08Dh, 000h, 001h, 000h, 000h, 001h, 000h, 000h, 000h, 0FFh, 045h, 05Ch
db 0C3h, 08Bh, 07Dh, 034h, 08Bh, 075h, 030h, 08Bh, 045h, 000h, 08Bh, 045h
db 020h, 08Bh, 045h, 040h, 0B9h, 004h, 000h, 000h, 000h, 083h, 0ECh, 020h
db 08Bh, 004h, 024h, 049h, 075h, 0F7h, 081h, 0C4h, 080h, 000h, 000h, 000h
db 0FFh, 075h, 03Ch, 09Dh, 09Bh, 0EBh, 005h, 090h, 090h, 090h, 090h, 090h
db 00Fh, 0A2h, 00Fh, 031h, 089h, 045h, 000h, 089h, 055h, 004h, 08Bh, 055h
db 02Ch, 08Bh, 04Dh, 028h, 08Bh, 05Dh, 024h, 08Bh, 045h, 020h, 08Bh, 06Dh
db 038h, 0C7h, 044h, 024h, 0FCh, 0C0h, 001h, 000h, 000h, 0FFh, 064h, 024h
db 0FCh, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h
db 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h
db 090h, 090h, 090h, 090h, 0E8h, 000h, 000h, 000h, 000h, 0C7h, 044h, 024h
db 0FCh, 000h, 002h, 000h, 000h, 0FFh, 064h, 024h, 0FCh, 090h, 090h, 090h
db 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h
db 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h
db 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h
db 090h, 090h, 090h, 090h, 090h, 090h, 090h, 090h, 089h, 02Dh, 050h, 000h
db 000h, 000h, 0BDh, 000h, 000h, 000h, 000h, 089h, 055h, 04Ch, 089h, 04Dh
db 048h, 089h, 05Dh, 044h, 089h, 045h, 040h, 09Ch, 08Fh, 045h, 054h, 09Bh
db 00Fh, 0A2h, 00Fh, 031h, 02Bh, 045h, 000h, 01Bh, 055h, 004h, 089h, 045h
db 000h, 089h, 055h, 004h, 0FFh, 065h, 014h, 056h, 057h, 0E8h, 05Ah, 000h
db 000h, 000h, 081h, 0E7h, 000h, 0F0h, 0FFh, 0FFh, 08Dh, 0B7h, 000h, 010h
db 000h, 000h, 001h, 07Fh, 02Fh, 001h, 07Fh, 033h, 001h, 07Fh, 039h, 001h
db 07Fh, 03Dh, 001h, 07Fh, 052h, 001h, 07Fh, 077h, 001h, 07Fh, 07Dh, 001h
db 0BFh, 081h, 000h, 000h, 000h, 001h, 0BFh, 096h, 000h, 000h, 000h, 001h
db 0BFh, 09Dh, 001h, 000h, 000h, 001h, 0BFh, 0C9h, 001h, 000h, 000h, 001h
db 077h, 003h, 001h, 077h, 008h, 001h, 0B7h, 0FDh, 000h, 000h, 000h, 001h
db 0B7h, 004h, 001h, 000h, 000h, 001h, 0B7h, 002h, 002h, 000h, 000h, 001h
db 0B7h, 007h, 002h, 000h, 000h, 05Fh, 05Eh, 0C3h, 08Bh, 03Ch, 024h, 0C3h
db 4096*2-PROFILERSIZE dup (0) ;combined size of section is 8192 bytes
_PROF1 ends


this is the output I get:

Profiling aligned
esp = 0012FFA0, :before profile
esp = 0012FFA0, :after profile
ProfilerResult.lowdword = 6, low dword
ProfilerResult.hidword = 0, High dword
----------------------------------------
Profiling RevStr
esp = 0012FFA0, :before profile
esp = 00130018, :after profile
ProfilerResult.lowdword = 8, low dword
ProfilerResult.hidword = 0, High dword
----------------------------------------

The two functions are almost identical, except that RevStr takes two parameters, so MASM sets up the stack frame etc...

For RevStr the stack isn't balanced: it's off by 120 bytes! If it took only 1 parameter the stack would be off by 60 bytes, so there's definitely something systematic going on...
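
The size of the imbalance can be read straight off the two ESP values printed for RevStr. A small Python sketch of that arithmetic; the variable names are only for illustration, and dividing by the 8 bytes of arguments is just one way of looking at the number, not something taken from the PROFILE docs:

```python
# ESP values from the debug output above for the RevStr run.
esp_before = 0x0012FFA0
esp_after = 0x00130018

PARAM_SIZE = 4   # each argument is one DWORD
PARAM_COUNT = 2  # RevStr takes _src and _dst

drift = esp_after - esp_before
arg_bytes = PARAM_SIZE * PARAM_COUNT

print(f"ESP moved by {drift} bytes ({drift:#x})")
print(f"that is {drift // arg_bytes} times the {arg_bytes} bytes of arguments")
```

The drift being an exact multiple of the argument size is what makes the problem look systematic rather than random.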

BTW I've been using vkim's debug macros for ages so I know the problem isn't with them; besides, they aren't part of the code being profiled.
I never had any luck with the previous version but I'll have a look and see.

Anyway, all things considered, it's a great piece of work.

cheers
Posted on 2003-01-18 12:06:27 by MArtial_Code

Are you sure the imbalance is not caused by your MASM macro?
If PROFILE doesn't cause a stack imbalance with no arguments, it can't cause one with arguments. It doesn't see arguments at all; it just provides the upper stack memory locations to the routine to be profiled, whatever they contain. Thus I think the problem may be in your MASM macro. Please try to test PROFILE with inline code first; it should work. I don't see how it could cause no stack imbalance without arguments and a stack imbalance with arguments when PROFILE doesn't even get to know about the difference. Only your MASM macro is aware of the presence of arguments; for PROFILE it's just memory above ESP, which may have a meaning or not. PROFILE just passes it to the routine to be profiled; it touches neither ESP nor any other register.
Posted on 2003-01-18 17:48:18 by Maverick
Hi Maverick!
The macros he uses are not the problem. I have tested Profiler without the macros and the result is still the same. The profiler seems to add the size of each param used in the proc every time it calls the proc it's going to profile. However, when the profiler calls the proc the very first time it doesn't add anything; it all starts after the first time.
90                  nop
E800000000          call XXXXXXXX          ; <-- this call
C74424FC00000000    mov [esp-04], XXXXXXXX
FF6424FC            jmp [esp-04]
90                  nop

So after each call you see above, it continues to increase by the size of the params. In the case above he uses DWORDs, which are 4 bytes in size. I gather that the profiler runs 15 times? Then we can multiply the total param size by 15 and we get DWORD * PARAMS * 15 == 4 * 2 * 15 == 120 (which is the number he mentioned above).

Another thing I noticed was that when I use more than two params, the profiler doesn't return. Sorry, but this is all I have at the moment. Maybe it can be of some use! :alright:
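
The arithmetic above can be played through with a toy stack model. This is only a sketch in Python under explicit assumptions: the routine is assumed to pop its own arguments on return (as a stdcall routine ending in `ret 8` would), the caller is assumed to push the arguments only once, and the count of 16 runs (1 clean + 15 drifting) is taken from the guess above rather than from the PROFILE source. The pushed routine address is left out, since the no-argument case shows it balanced. `simulate_esp` is a hypothetical helper, not part of PROFILE:

```python
def simulate_esp(start_esp: int, arg_bytes: int, runs: int) -> int:
    """Toy model of ESP across repeated calls to a callee-cleanup routine."""
    esp = start_esp
    esp -= arg_bytes          # caller pushes the arguments once
    for _ in range(runs):
        esp -= 4              # call pushes the return address
        esp += 4 + arg_bytes  # 'ret arg_bytes' pops return address + arguments
    return esp


START = 0x0012FFA0  # ESP before the arguments were pushed (from the output above)

for arg_bytes in (0, 4, 8):
    drift = simulate_esp(START, arg_bytes, runs=16) - START
    print(f"{arg_bytes} bytes of arguments -> ESP off by {drift} bytes")
```

Under these assumptions the first run consumes the arguments that were actually pushed and each later run pops the same amount again, which reproduces both the 120 bytes seen with two parameters and the 60 bytes quoted for one. Whether PROFILE really behaves this way would still need to be confirmed against its code.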
Posted on 2003-01-19 10:48:24 by natas