Is it possible? When you declare data in asm, you can use ALIGN, which makes memory access much faster (if you align it correctly). What I want to know is: WILL it work with opcodes too? For example, the following produces 5 bytes (an odd size):
    mov eax, 10h
But if I add another 3 bytes, such as NOPs:
    mov eax, 10h
    nop
    nop
    nop        ; which now becomes 8 bytes
Does it make any difference in terms of memory access time?
You need to read some of the Pentium optimization guides. Agner Fog's is the best, I think, and Intel has one as a PDF. There are times when an extra NOP does help; sometimes it lets instructions pair better. There are lots of things to consider when you try to optimize code for a Pentium: the length of instructions, various stalls, and so on. I look at it this way. Since I'm writing 1-to-1 code, I'm as efficient as possible already. The Pentium has a fair number of pairing combinations, so I'll be using both pipes a "fair" amount no matter what code I write. There are times, in the innermost parts of a program, when it may pay to spend the time and manually tweak the code. Some routines can see big improvements when properly optimized. The routines that are the "core" of the program should be optimized. But for the rest of the program, the one-time code like startup, shutdown, and error processing, does it really pay to pick up a few 1/1,000,000,000 of a second here and there? Besides, writing "good code" comes naturally with experience, no matter what language or platform. After a while, you just get a feel for what works best. Once you've written it the best way possible, it's already optimized, no matter which "processor of the month" it runs on... :)
I agree. I never attempt to optimize startup code. Knowing when and when not to optimize is also a skill. A lot of people try to optimize the whole program, which I used to think was the best approach. Now I look at optimizing in a different way, the way you stated above. Thanks for the reply.
I tend to optimize the guts of the program for speed. But since any instruction takes longer to load from disk than to run, I go for code-size optimization on the run-once parts of the program.
I am much of the same view as S/390: optimise what matters, write the rest in a tidy and maintainable way, and your program will be small where it matters and fast where it matters. Close-range opcode choice rarely matters for size, because 32-bit PE files built with MASM grow in 512-byte increments. It's a reasonable technique to develop a procedure using NOPs to pad out parts, either to improve pairing or to change the code alignment, but if and where you find something that does respond to manual alignment, the ALIGN directive does the job fine. ALIGN 4 does the job in most instances, and occasionally ALIGN 16 helps. Regards, firstname.lastname@example.org
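To put numbers on the padding discussed in this thread, here is a minimal sketch of the arithmetic behind aligning to a power-of-two boundary. It is written in Python rather than assembly purely so it can be run anywhere. The helper name `padding_for` is hypothetical, the 5-byte size of `mov eax, 10h` is taken from the first post, and the instruction is assumed to start at offset 0. It only counts padding bytes; it says nothing about whether the padded code actually runs faster.

```python
def padding_for(offset: int, boundary: int) -> int:
    """Bytes of padding needed so that `offset` lands on a multiple of `boundary`.

    Hypothetical helper for illustration; `boundary` must be a power of two,
    as with ALIGN 4 or ALIGN 16.
    """
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a positive power of two")
    return (-offset) % boundary


# Size of `mov eax, 10h` as stated in the first post (assumed to start at offset 0).
MOV_EAX_IMM32_SIZE = 5

for boundary in (4, 8, 16):
    pad = padding_for(MOV_EAX_IMM32_SIZE, boundary)
    print(f"boundary {boundary}: {pad} padding byte(s) after a "
          f"{MOV_EAX_IMM32_SIZE}-byte instruction")
```

For an 8-byte boundary this gives 3 padding bytes, which matches the three NOPs in the original question.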