Ok, I've started working on my own version of h2incn - a program that will parse C header files and convert them into Nasm compatible .inc files. After creating the initial program outline I am now ready to begin adding in the preprocessing support.

One of the things that I need is a fast way of performing lookups of defines and typedefs. I've settled on using hash maps and the hash function algorithm FNV-1 which, according to the authors, provides for a nice dispersion across a hash map. Thus I begin by defining a hash map to contain the key/value pairs and develop the hashing function.

I want the hashing function and the hash map to be capable of holding arbitrary types and data sizes. Thus the routines being developed do not depend on simple character strings (although you can indeed supply them to the functions). That way, the routines may be used to hold objects of all types for any purpose ( token processing, game world objects, in-memory databases, caching, etc. ).

The following is my Nasm modified version of the FNV-1 hash algorithm for use in both 32-bit and 64-bit systems. Note that the 32-bit version uses the standard C calling convention while the 64-bit version uses either Windows or Linux fastcall calling conventions. It may be optimized further (an exersize left for the reader) but I'm quite happy with it. I would love to know if others find use for it and what their collision findings are...


;
; FNV1HASH.ASM
;
; Copyright (C)2010 Rob Neff
; Source code is licensed under the new/simplified 2-clause BSD OSI license.
;
; This function implements the FNV-1 hash algorithm.
; This source file is formatted for Nasm compatibility although it
; is small enough to be easily converted into another assembler format.
;
; Example C/C++ call:
;
; #ifdef __cplusplus
; extern "C" {
; #endif
;
; unsigned int FNV1Hash(char *buffer, unsigned int len, unsigned int offset_basis);
;
; #ifdef __cplusplus
; }
; #endif
;
; int hash;
;
; /* obtain 32-bit FNV1 hash */
; hash = FNV1Hash(buffer, len, 2166136261);
;
; /* if desired - convert from a 32-bit to 16-bit hash */
; hash = ((hash >> 16) ^ (hash & 0xFFFF));
;



%ifidni __BITS__,32
;
; 32-bit C calling convention
;

%define buffer
%define len
%define offset_basis

global _FNV1Hash

_FNV1Hash:
  push ebp                    ; set up stack frame
  mov  ebp, esp
  push esi                    ; save registers used
  push edi
  push ebx
  push ecx
  push edx

  mov  esi, buffer            ;esi = ptr to buffer
  mov  ecx, len              ;ecx = length of buffer (counter)
  mov  eax, offset_basis      ;set to 2166136261 for FNV-1
  mov  edi, 1000193h          ;FNV_32_PRIME = 16777619
  xor  ebx, ebx              ;ebx = 0
nextbyte:
  mul  edi                    ;eax = eax * FNV_32_PRIME
  mov  bl,               ;bl = byte from esi
  xor  eax, ebx              ;al = al xor bl
  inc  esi                    ;esi = esi + 1 (buffer pos)
  dec  ecx                    ;ecx = ecx - 1 (counter)
  jnz  nextbyte              ;if ecx != 0, jmp to NextByte

  pop  edx                    ; restore registers
  pop  ecx
  pop  ebx
  pop  edi
  pop  esi
  mov  esp, ebp              ; restore stack frame
  pop  ebp
  ret                        ; eax = fnv1 hash

%elifidni __BITS__,64
;
; 64-bit function
;

%ifidni __OUTPUT_FORMAT__,win64
;
; 64-bit Windows fastcall convention:
;    ints/longs/ptrs: RCX, RDX, R8, R9
;    floats/doubles: XMM0 to XMM3
;
global FNV1Hash

FNV1Hash:
  xchg rcx, rdx              ;rcx = length of buffer
  xchg r8, rdx                ;r8 = ptr to buffer

%elifidni __OUTPUT_FORMAT__,elf64
;
; 64-bit Linux fastcall convention
;    ints/longs/ptrs: RDI, RSI, RDX, RCX, R8, R9
;    floats/doubles: XMM0 to XMM7
;
global _FNV1Hash

_FNV1Hash:
  mov  rcx, rsi
  mov  r8, rdi

%endif

  mov  rax, rdx              ;rax = offset_basis - set to 14695981039346656037 for FNV-1
  mov  r9, 100000001B3h      ;r9 = FNV_64_PRIME = 1099511628211
  mov  r10, rbx              ;r10 = saved copy of rbx
  xor  rbx, rbx              ;rbx = 0
nextbyte:
  mul  r9                    ;rax = rax * FNV_64_PRIME
  mov  bl,               ;bl = byte from r8
  xor  rax, rbx              ;al = al xor bl
  inc  r8                    ;inc buffer pos
  dec  rcx                    ;rcx = rcx - 1 (counter)
  jnz  nextbyte              ;if rcx != 0, jmp to nextbyte
  mov  rbx, r10              ;restore rbx
  ret                        ;rax = fnv1 hash

%endif


Note the difference in housekeeping required in 32-bit code compared to it's 64-bit big brother. As a leaf function(a function which makes no other calls) this code becomes incredibly small and fast. The function prologue/epilogue can be made ridiculously easy on 64-bit systems using the fastcall convention.

Optimizing for register pressure is so much simpler since Windows and Linux allow us to trash the R10 and R11 registers along with the registers used for parameters.

In a nutshell: keep your assembly code small, your parameter list within the fastcall convention, create leaf functions if you can (unless calling your own functions), and write portable code.  :)
Posted on 2010-08-21 15:26:31 by p1ranha
Btw, pre-dividing by symbol-length might be nice, too. Below are statistics from a C++ preprocessor parser of mine, parsing the whole <windows.h>  tree:


time0 = 187
Bench results:
hashtab2::find: avg=215/259 min=47 last=47 max=164635, count=181320
hashtab2::add: avg=198/273 min=47 last=152 max=77302, count=17947
MaxSymSize = 70
Sym[1] = 0
Sym[2] = 5
Sym[3] = 7
Sym[4] = 128
Sym[5] = 117
Sym[6] = 167
Sym[7] = 264
Sym[8] = 429
Sym[9] = 539
Sym[10] = 638
Sym[11] = 755
Sym[12] = 871
Sym[13] = 881
Sym[14] = 897
Sym[15] = 818
Sym[16] = 807
Sym[17] = 798
Sym[18] = 856
Sym[19] = 781
Sym[20] = 786
Sym[21] = 789
Sym[22] = 804
Sym[23] = 671
Sym[24] = 579
Sym[25] = 571
Sym[26] = 522
Sym[27] = 471
Sym[28] = 431
Sym[29] = 404
Sym[30] = 349
Sym[31] = 318
Sym[32] = 256
Sym[33] = 217
Sym[34] = 185
Sym[35] = 163
Sym[36] = 98
Sym[37] = 121
Sym[38] = 87
Sym[39] = 67
Sym[40] = 45
Sym[41] = 50
Sym[42] = 35
Sym[43] = 30
Sym[44] = 32
Sym[45] = 19
Sym[46] = 19
Sym[47] = 12
Sym[48] = 12
Sym[49] = 9
Sym[50] = 9
Sym[51] = 6
Sym[52] = 3
Sym[53] = 4
Sym[54] = 5
Sym[55] = 2
Sym[56] = 3
Sym[57] = 0
Sym[58] = 0
Sym[59] = 1
Sym[60] = 0
Sym[61] = 1
Sym[62] = 1
Sym[63] = 0
Sym[64] = 0
Sym[65] = 0
Sym[66] = 0
Sym[67] = 0
Sym[68] = 0
Sym[69] = 1
Sym[70] = 1
Posted on 2010-08-22 06:17:59 by Ultrano
Also see the CrapWow (?) Hash:

http://www.team5150.com/~andrew/noncryptohashzoo/CrapWow.html

with inlined GAS and C implementations.
Posted on 2010-09-01 11:51:44 by Wolfshook