Hi guys, I'm trying to adapt the BM algo so it can use masks when searching for strings, but I'm having some trouble.

1st - I used the bmh.asm file and fixed a small error (it was not finding a string located at the initial position).

2nd - I added a "?" mask to skip over the wanted char, but it only finds the string when the "?" char is at position 0, 1, or in the last byte of the string.

The algo is:


BMHBinsearch proc startpos:DWORD,
                  lpSource:DWORD,srcLngth:DWORD,
                  lpSubStr:DWORD,subLngth:DWORD

  ; ------------------------------------------------------------------
  ; This algorithm is related to a Horspool variation of a Boyer
  ; Moore exact pattern matching algorithm. It only uses the bad char
  ; shift and increments the source if the character is in the table
  ; ------------------------------------------------------------------

    LOCAL cval:DWORD
    LOCAL shift_table[256]:DWORD

    push ebx
    push esi
    push edi

    mov ebx, subLngth

    cmp ebx, 1
    jg @F
    mov eax, -2                 ; string too short, must be > 1
    jmp BMHout
  @@:

    mov esi, lpSource
    add esi, srcLngth
    sub esi, ebx
    mov edx, esi                ; set Exit Length

  ; ----------------------------------------
  ; load shift table with value in subLngth
  ; ----------------------------------------
    mov ecx, 256
    mov eax, ebx
    lea edi, shift_table
    rep stosd

  ; ----------------------------------------------
  ; load descending count values into shift table
  ; ----------------------------------------------
    mov ecx, ebx                ; SubString length in ECX
    dec ecx                     ; correct for zero based index
    mov esi, lpSubStr           ; address of SubString in ESI
    lea edi, shift_table

    xor eax, eax

  Write_Chars:
    mov al, [esi]               ; get the character
    inc esi
    mov [edi+eax*4], ecx        ; write shift for each character
    dec ecx                     ; to ascii location in table
    jnz Write_Chars

  ; -----------------------------
  ; set up for main compare loop
  ; -----------------------------
    mov ecx, ebx
    dec ecx
    mov cval, ecx

    mov esi, lpSource
    mov edi, lpSubStr
    add esi, startpos           ; add starting position

; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  Main_Loop:
    xor eax, eax                ; zero EAX before partial write - xor is faster
    mov al, [esi+ecx]

    cmp byte ptr [edi+ecx], '?'
    je goodmask                 ; 1st mask: "?" matches any char, so jump

    cmp al, [edi+ecx]           ; cmp characters in ESI / EDI
    jne Get_Shift               ; if not equal, get next shift

goodmask:
    dec ecx
    jns Main_Loop

    jmp Matchx

  Get_Shift:
    inc esi                     ; inc esi for minimum shift
    cmp ebx, shift_table[eax*4] ; cmp subLngth to char shift
    jne Exit_Test
    add esi, ecx                ; add bad char shift
  Exit_Test:
    mov ecx, cval               ; reset counter in compare loop
    cmp esi, edx                ; test for exit condition
    jle Main_Loop               ; fixed here !!!

    jmp MisMatch

; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

  Matchx:
    sub esi, lpSource           ; sub source from ESI
    mov eax, esi                ; put length in eax
    jmp BMHout

  MisMatch:
    mov eax, -1

  BMHout:
    pop edi
    pop esi
    pop ebx

    ret

BMHBinsearch endp



The results for my test examples are:


call BMHBinsearch 0, gugasource, D@SrcLen, gugateste, D@tmplen

; ok matched
; ; ok matched
; ; ok matched
; ; no
; ; no
; ; no
; ; no
; ;ok matched





I have no idea why it does not find the string when the "?" byte is anywhere other than position 0, 1, or the end of the string.
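
One possible explanation, offered as a guess rather than a confirmed diagnosis: in a Horspool-style search a "?" can line up with any byte, so no shift may be larger than the distance from the last "?" (ignoring the final pattern byte) to the end of the pattern. A table that is built as if "?" were an ordinary character can therefore skip over valid alignments. The sketch below is written in C instead of MASM only so the logic is easy to check. The name masked_horspool and its parameters are hypothetical and are not part of bmh.asm, and it uses the textbook Horspool rule (shift on the source byte under the last pattern byte), not the exact shift rule of the routine above.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the MASM32 library.
   Returns the offset of the first match, or -1 if there is none.
   A pattern byte equal to `wildcard` matches any source byte. */
long masked_horspool(const unsigned char *src, size_t src_len,
                     const unsigned char *pat, size_t pat_len,
                     unsigned char wildcard)
{
    size_t shift[256];
    size_t i, j, pos, max_shift;

    if (pat_len == 0 || pat_len > src_len)
        return -1;

    /* Cap every shift at the distance from the last wildcard
       (final pattern byte excluded) to the end of the pattern. */
    max_shift = pat_len;
    for (i = 0; i + 1 < pat_len; i++)
        if (pat[i] == wildcard)
            max_shift = pat_len - 1 - i;

    for (i = 0; i < 256; i++)
        shift[i] = max_shift;

    /* Literal bytes get their usual Horspool shift, never above the cap. */
    for (i = 0; i + 1 < pat_len; i++)
        if (pat[i] != wildcard && pat_len - 1 - i < shift[pat[i]])
            shift[pat[i]] = pat_len - 1 - i;

    pos = 0;
    while (pos + pat_len <= src_len) {
        /* Compare right to left, letting the wildcard match anything. */
        j = pat_len;
        while (j > 0 && (pat[j - 1] == wildcard ||
                         pat[j - 1] == src[pos + j - 1]))
            j--;
        if (j == 0)
            return (long)pos;
        /* Shift on the source byte under the last pattern byte. */
        pos += shift[src[pos + pat_len - 1]];
    }
    return -1;
}
```

The cost of the cap is speed: a "?" near the end of the pattern pushes every shift toward 1, so the search degrades toward a byte-by-byte scan. A literal "?" byte in the source also cannot be searched for specifically with this scheme.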

Also, if someone has a faster and reliable algo that works with masks, could you please post it here?

And is bm.asm faster than this one? I didn't use it because I found the same problem with finding a string at the initial position, and I was unable to fix it.
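
For the "reliable" half of that request, a plain byte-by-byte masked search makes a useful baseline: it is not fast, but it is simple enough to trust, so its results can be compared against any faster routine on the same inputs. This is a minimal sketch in C, and the name masked_naive and its parameters are hypothetical.

```c
#include <stddef.h>

/* Hypothetical baseline, not part of the MASM32 library.
   Tries every start offset in turn and returns the first one where
   each pattern byte is either the wildcard or equal to the source
   byte. Returns -1 if no offset matches. */
long masked_naive(const unsigned char *src, size_t src_len,
                  const unsigned char *pat, size_t pat_len,
                  unsigned char wildcard)
{
    size_t pos, j;

    if (pat_len == 0 || pat_len > src_len)
        return -1;

    for (pos = 0; pos + pat_len <= src_len; pos++) {
        for (j = 0; j < pat_len; j++)
            if (pat[j] != wildcard && pat[j] != src[pos + j])
                break;              /* mismatch at this offset */
        if (j == pat_len)
            return (long)pos;       /* every byte matched */
    }
    return -1;
}
```

Running both routines over the same set of masked patterns and flagging any offset where they disagree is a cheap way to catch skipped matches.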

Best Regards,

Guga
Posted on 2006-12-24 08:17:15 by Beyond2000!