OK, this may seem like an odd question to many, but here goes anyway.

Let's say you want to generate some hard-to-read code. One way is to write it all yourself; another is to let the assembler generate the code where possible. By hard to read I mean the end result, the binary code; the assembler source written by the user should still be fairly easy to read.
In C it doesn't take a lot to make the compiler generate a lot of code. This could be done with a macro that automatically assigns registers, variables and stack space for your routine.

But what about MASM? Where would you get the most assembler-generated code? In asm you usually write it all yourself, but it's possible to make expandable macros, and there are things like automatic preserving of registers via the USES directive.

Naturally I'm thinking that the macro engine/preprocessor would give the most generated code, but maybe there are other ways.

An example of what I want is automatic assignment of registers etc., so that if you use the same macro twice it doesn't have to use the same registers twice.
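A minimal sketch of that idea, written in Python as a stand-in for a source preprocessor rather than as real MASM macro code. The register pool, the "zero fill" macro and all names are hypothetical; the sketch assumes the listed registers are free to clobber and that count is at least 1.

```python
import itertools

# Assumed pool of registers the expander is allowed to clobber.
SCRATCH = ["eax", "ecx", "edx", "ebx", "esi", "edi"]

def make_expander(registers=SCRATCH):
    """Return a macro expander that takes fresh registers on every expansion."""
    pool = itertools.cycle(registers)

    def expand_zero_fill(dest, count):
        # Hypothetical macro: zero `count` bytes starting at label `dest`.
        ptr, cnt = next(pool), next(pool)
        return [
            f"    mov {ptr}, offset {dest}",
            f"    mov {cnt}, {count}",
            f"@@: mov byte ptr [{ptr}], 0",
            f"    inc {ptr}",
            f"    dec {cnt}",
            f"    jnz @B",
        ]

    return expand_zero_fill

expand = make_expander()
first = expand("buf1", 16)    # uses eax/ecx
second = expand("buf2", 16)   # same macro, now uses edx/ebx
```

Two expansions of the same macro then produce different register usage without any change to the calling source.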
Another thing I've seen in compiled code (generated by a C compiler, though) is a strange use of arrays.
Imagine you have two arrays, arr1 and arr2. You then use arr1 plus the distance in memory from arr1 to arr2 to get data out of arr2.
In pseudo-language this could look like

mov register, arr1 [ (arr2 - arr1) + <offset into the array>]

<offset into the array> could actually be another register or a fixed number.
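To make the addressing trick concrete, here is a small Python model of it. It is only an illustration: MEMORY, the base addresses ARR1 and ARR2, and the helper names are all made up, and a flat list stands in for the process address space.

```python
# Hypothetical flat "memory" with two arrays at assumed base addresses.
MEMORY = [0] * 32
ARR1 = 4    # assumed base address of arr1
ARR2 = 12   # assumed base address of arr2

def store(base, values):
    """Write values into memory starting at base."""
    for i, value in enumerate(values):
        MEMORY[base + i] = value

def load_arr2_via_arr1(index):
    # Models: mov register, arr1[(arr2 - arr1) + index]
    delta = ARR2 - ARR1   # a constant the assembler can fold at build time
    return MEMORY[ARR1 + delta + index]

store(ARR1, [10, 11, 12, 13])
store(ARR2, [20, 21, 22, 23])
```

The instruction only ever names arr1, yet every load lands in arr2, which is what makes the disassembly misleading at first glance.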

I hope some of this makes sense, and maybe someone knows in which type of code you could most easily get the assembler itself to generate the code.

// CyberHeg
Posted on 2002-08-13 01:49:56 by CyberHeg
Many people have tried filling a binary file full of junk to make code harder to read, but at the disassembly level it does not make much difference, and the guys who are good at it read it just as fast anyway.

If it is a protection technique you are after, I would be inclined to do something a bit smarter than that.

Regards,

hutch@movsd.com
Posted on 2002-08-13 06:45:29 by hutch--
bitRAKE wrote a few BLOAT macros :)

http://www.asmcommunity.net/board/index.php?topic=5601
Posted on 2002-08-13 06:50:36 by bazik
Thanks, I'll look into the bloat macros.

I agree fully with you, hutch: any experienced person can reverse and understand it; the question is just how much time they want to spend on it.

// CyberHeg
Posted on 2002-08-13 12:24:35 by CyberHeg
The problem is that an experienced cracker will not be annoyed at all by something generated by the bloat macro... in fact, it will even look suspicious if you bloat at a precise place!
The only thing it would do, imho, is to bug the end user with a bigger than needed executable...

Delphi is in itself a good code "obfuscator"... but it isn't at all a prevention against cracking...
Posted on 2002-08-13 15:02:37 by JCP
A side note to Readiosys's comment is that a preparser can and will do an excellent job. I'm creating one while writing this, and just a simple thing like rearranging the order of the instructions and filling the gaps with "jumps" to the next instruction (while not messing up the flags or other registers) makes a big difference. Of course this alone is only a small delay (there is more to it than just jumping around), but my point is that it's easily possible in asm to make an obfuscator which will make the flow much harder to follow.
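A rough Python sketch of the reorder-and-jump step, not the actual preparser described above. It assumes straight-line input with no jumps or labels of its own, treats instructions as opaque strings, and uses made-up label names (L0, L1, ..., done). An unconditional jmp leaves the flags and registers untouched, which is why it is safe as glue.

```python
import random

def scatter(instructions, seed=None):
    """Shuffle instructions, chaining them with jmp so execution order is kept."""
    rng = random.Random(seed)
    count = len(instructions)
    blocks = []
    for i, ins in enumerate(instructions):
        target = f"L{i + 1}" if i + 1 < count else "done"
        blocks.append([f"L{i}:", f"    {ins}", f"    jmp {target}"])
    rng.shuffle(blocks)               # physical order is now random
    listing = ["    jmp L0"]          # entry point jumps to the real first block
    for block in blocks:
        listing.extend(block)
    listing.append("done:")
    return listing

def trace(listing):
    """Follow the jmp chain and return the instructions in execution order."""
    labels = {line[:-1]: i for i, line in enumerate(listing) if line.endswith(":")}
    pc, executed = 0, []
    while pc < len(listing):
        line = listing[pc].strip()
        if line.startswith("jmp "):
            pc = labels[line[4:]]
        else:
            if not line.endswith(":"):
                executed.append(line)
            pc += 1
    return executed

original = ["mov eax, 1", "add eax, 2", "shl eax, 1", "mov ebx, eax"]
listing = scatter(original, seed=7)
```

The trace helper is only there to check that the shuffled listing still executes the original sequence.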

The reason for starting this thread in the first place is that I just wanted to expand my horizons, and since you can't read about this in a book, why not ask.

The same goes for randomized assembling. Given N subroutines named Sub[0 -> N-1], I'd like to select one of them at random, so that every time you assemble the file you get a different binary. This would of course require a random generator at macro level (if possible). I think that some of the ideas from the bloat macros should make this possible, although I am not sure.
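As a sketch of the selection step only, here is how an external build script could do it in Python, assuming the random choice is made outside the assembler. The three variant bodies and the function names are hypothetical; note that the variants are only roughly equivalent, since xor and sub change the flags while mov does not.

```python
import random

# Hypothetical bodies for Sub0..Sub2; each one leaves eax = 0.
VARIANTS = [
    "xor eax, eax",
    "sub eax, eax",
    "mov eax, 0",
]

def pick_variant(variants, seed=None):
    """Pick one variant; a fixed seed makes the build reproducible."""
    rng = random.Random(seed)
    index = rng.randrange(len(variants))
    return index, variants[index]

def emit(seed=None):
    """Return MASM-style source text for the chosen subroutine."""
    index, body = pick_variant(VARIANTS, seed)
    return f"Sub{index} proc\n    {body}\n    ret\nSub{index} endp\n"
```

Calling emit() with no seed gives a fresh choice per build, while passing a seed lets you rebuild the exact same binary for debugging.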

I'm sorry if this is a little off topic though. I don't want to turn this into a reverse engineering discussion so I'll try to keep my words at source code level :)

// CyberHeg
Posted on 2002-08-13 15:25:57 by CyberHeg
I don't think anybody needs to tell CyberHeg what is effective in fooling crackers and what isn't - there's a good chance he knows better than you (just like me). As everybody who has dealt with software de/protection should know, you can never completely stop attackers - only slow them down. And (good) obfuscation works wonders. The trick is to produce a lot of garbage code, make it look like it's not garbage, mingle the normal code with the garbage, and make program flow nontrivial (jumps, conditional jumps, deeply nested calls, etc). This is a pain to do by hand, and will result in unmanageable source, which is what I believe is the reason for CybH's post here.

Macros would be nice, but I dunno if any assembler (even fasm) has strong enough macro facilities to make the necessary grade of obfuscation possible... a source preprocessor is doable, but would require writing much of the same code that an assembler needs (unless you want to fill your source with helper comments), and perhaps even some basic VM/state machine. Not a small task.

Another approach would be binary modifications - like z0mbie's mistfall. Quite some impressive work there, even if the mutation/obfuscation samples in his revert4 demonstration program are sort of simplistic; I haven't studied the mistfall engine, only looked VERY briefly at it, so I don't know what the core engine is capable of. But it does seem nice.
Posted on 2002-08-13 16:52:33 by f0dder
A thought just occurred to me while I was reading the thread. It's about lea, the wonder
do-it-all instruction. If I remember right it can add, subtract, mult, (divide by mult?),
mov, all sorts of stuff. Would it possibly slow crackers down if you went out of your
way to use just about nothing but lea? The way I see it, it would be a huge pain
to manually go through just about every statement to see what it was actually
doing...
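For reference, a small Python model of what a 32-bit lea computes: base + index*scale + displacement, with the scale limited to 1, 2, 4 or 8. The helper names are made up for illustration. It covers add, subtracting a constant, multiplying by a few small constants and plain copies; lea has no way to divide.

```python
MASK32 = 0xFFFFFFFF

def lea(base=0, index=0, scale=1, disp=0):
    """Model of 32-bit lea: base + index*scale + disp, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only allows scale factors 1, 2, 4 and 8")
    return (base + index * scale + disp) & MASK32

def times5(x):
    return lea(base=x, index=x, scale=4)   # lea eax, [eax+eax*4]

def add(x, y):
    return lea(base=x, index=y)            # lea eax, [eax+ebx]

def sub_const(x, c):
    return lea(base=x, disp=-c)            # lea eax, [eax-c]

def copy(x):
    return lea(base=x)                     # lea eax, [ebx]
```

Multiplying by constants such as 3, 5 or 9 takes one lea; anything else needs a chain of them or a different instruction.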
Posted on 2002-08-13 23:34:24 by Graebel
Thanks for the nice words, f0dder. I must say that macros seem to be one way to go, but they don't seem to be good enough for my needs.

To answer Graebel's comment:

After having a lot of experience in these areas I've come to some conclusions. One of them is that code written manually by humans can also be read just as easily by humans, because even with the large instruction set available from Intel, people seem to use the same instructions over and over again. Lea is OK, but you can't create a whole program with lea alone. Another thing is that if you write a lot of lea code by hand, you as the developer will also have a hard time understanding the code a week later.
The thing I've realized is that people seem to use the same flow in their programs to do specific things. People who want to reverse the programs know this, and it's just a matter of building up an effective "pattern matcher" in your head. By this I mean that once you've seen a C program disassembled maybe 100 times, then when you see a similar program for the 101st time you can easily identify what the code does. This is simply because you learn how the compiler generates code.
On the other hand, if you obfuscate your code you will soon see that even the simplest program becomes overly complex. Imagine a program with 1000 instructions forming a spiderweb, with the real instructions mixed in among the rest. You can't untangle this by hand anymore. Of course it's not impossible, but it's usually not worth the time just to find out that this spiderweb did a string compare or whatever (something that now takes around 1000 instructions), which you would normally have recognized in a split second.
Like hutch wrote, this won't stop anyone who is determined, and that's not the point of it either, but it does slow people down.
To defeat this you will want to make an engine to de-obfuscate the code. I know of such a tool which was created in the past for the cracking community, and the author wrote that his code was around 12000 lines (in C++, I believe).
Why that much code? He had to use parts of a disassembler engine and make a complex code analyzer to see how the code could be simplified.

So back to the conclusion of mine.
Human code can be deprotected by humans while computer generated code can be deprotected by the computer itself.

...And therefore automatic code generation made by a compiler/assembler is worth studying in such a case to see how much you can do without making your own programs.

If you do a search on Google you will even see many Java and C# obfuscators. Many of those are commercial and some are even pretty expensive.

// CyberHeg
Posted on 2002-08-14 02:07:41 by CyberHeg
I'm getting there slowly with ideas such as this for what will end up being my own private PE encrypter :). My plan has only really got as far as making a simple front end and a few ideas, such as using reloc information to be able to scramble assembled code etc.
Posted on 2002-08-14 03:10:53 by huh
Originally posted by f0dder
I don't think anybody needs to tell CyberHeg what is effective in fooling crackers and what isn't - there's a good chance he knows better than you (just like me).


Oh, is he a regular cracker/reverser?
If so, sorry for not being a "cracking/reversing scene" regular... and not knowing him... I was just speaking my mind after reading the thread, which I think isn't too far outside the purpose of a messageboard...
This board is mainly aimed at programmers, so don't expect to see many experienced crackers here (and if there are, they won't talk about it... "just like you").

And if nobody is *allowed* to say anything about this because of his presumed lack of cracking/reversing skills (it isn't only for PCs, you know :tongue: ), I don't see the point of this post... :tongue:

As everybody who has dealt with software de/protection should know, you can never completely stop attackers - only slow them down. And (good) obfuscation works wonders. The trick is to produce a lot of garbage code, make it look like it's not garbage, mingle the normal code with the garbage, and make program flow nontrivial (jumps, conditional jumps, deeply nested calls, etc). This is a pain to do by hand, and will result in unmanageable source, which is what I believe is the reason for CybH's post here.


As for my comment, it was only about a #bloat macro that generates bytes at a specified place in the executable... which is imho not an efficient way to hide a routine... I wasn't talking about his idea of the preprocessor, but was answering the people who talked about the #bloat solution.
Posted on 2002-08-14 03:37:24 by JCP
Interesting post - one of the examples of cross-overs between copy-protection and cr@cking. This seems to be within the allowed bounds of the board, although reaching the borderline.

Hehe, bet you there are more experienced cr@ckers here than you think. Since most of them are very interested in assembly as well, and they do create their own programs - why shouldn't they participate in the discussions here, offering their knowledge and gaining some as well, as long as the topics are within the borders? ;)

And yeah, the #bloat thingy would provide nothing in the way of protection ... since no program flow would ever be directed thru it. Obfuscation on the other hand can be rather annoying, until you get your head around it.

Imo, the most annoying thing one can do with obfuscation is to run a line of code just once .... let it modify something on disk or in the registry. Typically, while analyzing some hidden code you wanna run thru it more than once, to make sure you got it right. If it's only executed once, you're in trouble ;) This could of course also be done with, say, a jumptable and a set of code flows that do the same thing but look different. I believe something along the lines of this was mentioned earlier. This would confuse quite a few ppl, but would of course also add to the code size.
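The jumptable idea can be sketched in Python like this. It is only a model of the control flow, with made-up function names: three string compares that give the same answer but are written differently, dispatched through a table at random.

```python
import random

def check_a(s, t):
    return s == t

def check_b(s, t):
    return len(s) == len(t) and all(x == y for x, y in zip(s, t))

def check_c(s, t):
    # Same result again, this time via XOR of the character codes.
    return len(s) == len(t) and not any(ord(x) ^ ord(y) for x, y in zip(s, t))

# The "jumptable": every entry does the same job but looks different.
JUMP_TABLE = [check_a, check_b, check_c]

def compare(s, t, rng=random):
    """Dispatch through a randomly chosen table entry."""
    return rng.choice(JUMP_TABLE)(s, t)
```

Someone stepping through compare() twice may land in a different routine each time, at the cost of carrying three copies of the same logic.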

Fake
Posted on 2002-08-14 06:46:51 by Fake51