Hi Guys,

Does anyone know how to write code that can recognize strings shorter than 4 characters?

For example, say we have an app whose data section contains something like this:

test01 dd 0120A
test02 db "Hi" , 0
test03 dd " ...some data that can contains strings."

How can I write code (or an algorithm) that takes the data in test01, test02 and test03 and analyses it to decide whether it is a string, plain data, a pointer, etc.?

Is an ASCII table necessary?

I mean something like 0123456789ABCDEFGHIJ....abcdef....??????????:

That covers the English ASCII table, but what about other languages, such as Russian, Chinese or Japanese?

How can we tell a string apart from a dword that holds a pointer?

Best Regards,

Guga
Posted on 2003-11-20 08:46:30 by Beyond2000!
Perhaps an entropy method.

Scan the bytes for 'a'-'z', 'A'-'Z' and '0'-'9', then find the null terminator. If the desired percentage of the characters falls within those ranges, you have your string. You might also want to expand the set to include special characters such as "/" and so on.

I think other languages would be more difficult, because their encodings (e.g. Unicode) have a much bigger character range.
Posted on 2003-11-20 08:55:46 by roticv
Hmmm...

Yeah roticv... this should solve it. Thanks.

But is there a table of the complete ASCII set, including the special chars?

I remember seeing the code you mentioned, but I can't find it here or in my files. I guess the scanning code is similar to the code for hex data, right?

If you have part of the code, can you post it here, please?

Best Regards,

Guga
Posted on 2003-11-20 09:23:27 by Beyond2000!
This cannot be very accurate.


value dd 000414141h

This could be the string "AAA",0 (x86 stores a dword low byte first, so the bytes in memory are 41h 41h 41h 00h) or a pointer to address 00414141h. Data is whatever a program uses it for; if you find a valid ASCII string, there is no guarantee that is what it is intended to be used as.
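
To make the ambiguity concrete, here is a small sketch in C (rather than assembly, for brevity) that reads the same four bytes both ways. The helper names read_dword_le and read_zstring_len4 are made up for this illustration, and little-endian byte order is assumed, as on x86.

```c
#include <stdint.h>
#include <string.h>

/* Reads four bytes as a little-endian dword, as x86 does for a dd. */
static uint32_t read_dword_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Reads the same four bytes as a zero-terminated string and returns
   its length, or 4 if there is no terminator within the four bytes. */
static size_t read_zstring_len4(const unsigned char *p)
{
    const unsigned char *end = memchr(p, 0, 4);
    return end ? (size_t)(end - p) : 4;
}
```

For the bytes 41h 41h 41h 00h, read_dword_le gives 00414141h and read_zstring_len4 gives 3, so both readings are valid and nothing in the bytes themselves says which one the program intends.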
Posted on 2003-11-20 09:42:33 by ENF
Beyond2000!,

I do not have the code with me at the moment. Too many things to do right now :grin: Exams are just over and I am enjoying life. By the way, one more thing to add: strings are usually accessed via their offset (I hope you know what I mean).

roticv
Posted on 2003-11-20 10:13:15 by roticv
Table1 below will actually appear to the assembler (MASM32) as an ASCII string, so each pair of hex digits must be converted to a byte value before comparing. The code then compares the pattern against memory byte by byte.


.data

Table1 db "004642E8CCFAFFFF8AD8B0FF8A0638D8h:%s",0
bytecount dd 0
Memcount dd 0

.code

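Ascii2Hex: ;Subroutine: converts the two hex digits at Table1+edx to one byte in al

mov al,[Table1+edx] ; first digit (operand lost in the post; [Table1+edx] assumed)
cmp al,39h
ja $+09h ; above '9', so it is a letter
sub al,30h ; '0'-'9' -> 0-9
shl al,04h
jmp $+0dh
test al,020h ; lowercase letter?
jz $+04h
sub al,20h ; fold to uppercase
sub al,37h ; 'A'-'F' -> 0Ah-0Fh
shl al,04h
mov ah,al ; keep the high nibble in ah
inc edx
mov al,[Table1+edx] ; second digit (operand lost in the post; [Table1+edx] assumed)

cmp al,39h
ja $+06h
sub al,30h
jmp $+0ah
test al,020h
jz $+04h
sub al,20h
sub al,37h
or al,ah ; combine high and low nibbles
ret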



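Setbl:

mov bytecount,10h ; 16 bytes = 32 hex digits to match
xor edx,edx ; restart at the first digit of Table1
call Ascii2Hex ; Table
mov bl,al ; bl = first byte of the pattern

read:

mov al,byte ptr [esi] ; operand lost in the post; [esi] assumed
cmp al,bl
jz equal
inc esi
dec Memcount ; how much memory to scan
jnz read
jmp XXXXX ; string not found in memory specified

equal:

inc esi
dec Memcount
inc edx ; step to the next pair of hex digits
dec bytecount
jz found ; String found
call Ascii2Hex
mov bl,al
mov al,byte ptr [esi] ; operand lost in the post; [esi] assumed
cmp al,bl
jz equal
jmp Setbl ; mismatch: start again from the first pattern byte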

found:

Do what you want here. The address will be in "esi". Be sure to subtract the byte count from esi so that esi will point to first byte of string.
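
For comparison, the same two steps (hex text to bytes, then a byte-by-byte scan) can be sketched in C. This is only an illustration under my own assumptions, not a translation of the routine above: the names hex_nibble, hex_to_bytes and find_bytes are hypothetical, and the scan retries every start offset, so a partial match does not cause a later overlapping match to be skipped.

```c
#include <stddef.h>
#include <string.h>

/* Converts one hex digit to 0..15, or returns -1 if it is not one. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Converts the first 2*count hex digits of hex into count bytes.
   Returns 0 on success, -1 if a non-hex character is met. */
static int hex_to_bytes(const char *hex, unsigned char *out, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        int hi = hex_nibble(hex[2 * i]);
        if (hi < 0) return -1;
        int lo = hex_nibble(hex[2 * i + 1]);
        if (lo < 0) return -1;
        out[i] = (unsigned char)((hi << 4) | lo);
    }
    return 0;
}

/* Returns the offset of the first occurrence of pat in mem, or -1. */
static long find_bytes(const unsigned char *mem, size_t mem_len,
                       const unsigned char *pat, size_t pat_len)
{
    if (pat_len == 0 || pat_len > mem_len)
        return -1;
    for (size_t i = 0; i + pat_len <= mem_len; i++) {
        if (memcmp(mem + i, pat, pat_len) == 0)
            return (long)i;
    }
    return -1;
}
```

Here the returned offset already points at the first byte of the match, so no subtraction is needed afterwards.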
Posted on 2003-11-20 15:12:52 by mrgone

Analyzing data in memory is very difficult. If you're dealing with ASCII characters, particularly zstring data, you can sometimes figure out which is string data and which is binary data. But in general, this problem cannot be solved without additional help, because some binary data looks just like ASCII data.
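
For the zstring case, the percentage idea suggested earlier in the thread could be sketched roughly like this in C. It is a heuristic under stated assumptions, not a reliable classifier: the function names, the min_percent threshold and the extras parameter (additional characters such as "/" to count as text) are all invented for this example.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if c is a letter, a digit, or one of the caller-supplied
   extra characters (for example " /.:"), else 0. */
static int is_text_byte(unsigned char c, const char *extras)
{
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
        (c >= '0' && c <= '9'))
        return 1;
    return c != 0 && strchr(extras, c) != NULL;
}

/* Treats buf as a candidate zero-terminated string.
   Returns its length if a terminator exists within len bytes, the
   string is not empty, and at least min_percent of its bytes are
   text bytes. Returns -1 otherwise. */
static long zstring_length(const unsigned char *buf, size_t len,
                           int min_percent, const char *extras)
{
    size_t n = 0, text = 0;

    while (n < len && buf[n] != 0) {
        text += (size_t)is_text_byte(buf[n], extras);
        n++;
    }
    if (n == len || n == 0)
        return -1;              /* no terminator, or empty string */
    if (text * 100 < (size_t)min_percent * n)
        return -1;              /* too many non-text bytes */
    return (long)n;
}
```

With these assumptions the bytes of db "Hi",0 pass even at 100 percent and return 2, while the bytes 0Ah 12h 00h 00h are rejected. A pass only means the bytes look like text; it says nothing about how the program actually uses them.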
Cheers,
Randy Hyde
Posted on 2003-11-20 15:39:14 by rhyde